
Re: [Condor-users] LogonUser(condor-reuse-slot1, ... ) failed with status 1385



The error you are encountering, as Gunjit suggests, may be due to Condor or the slot users not having sufficient privileges to allow Condor to log in the Condor slot accounts as interactive users (hence status 1385, ERROR_LOGON_TYPE_NOT_GRANTED).

To enable this across all your machines, have your IT department grant the Condor user accounts the SE_INTERACTIVE_LOGON_NAME right (i.e. "SeInteractiveLogonRight").  This should fix the problem for you.
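
If it helps to check which machines are affected, here is a minimal sketch (not part of Condor) that scans StarterLog text for failed LogonUser calls and names the right to check.  The function names and the regular expression are my own assumptions, based only on the log line format quoted below; the error codes and right names are the standard Win32 ones.

```python
import re

# Standard Win32 error codes (winerror.h).
ERROR_LOGON_FAILURE = 1326
ERROR_LOGON_TYPE_NOT_GRANTED = 1385

# Windows user right required for each logon type.
RIGHT_FOR_LOGON_TYPE = {
    "interactive": "SeInteractiveLogonRight",
    "batch": "SeBatchLogonRight",
}

# Assumed pattern, modelled on the StarterLog line:
#   LogonUser(condor-reuse-slot1, ... ) failed with status 1385
LOGON_FAILED = re.compile(
    r"LogonUser\((?P<account>[^,]+),.*\) failed with status (?P<status>\d+)"
)


def find_logon_failures(log_text):
    """Return a list of (account, status) pairs for failed LogonUser calls."""
    return [
        (m.group("account").strip(), int(m.group("status")))
        for m in LOGON_FAILED.finditer(log_text)
    ]


def explain(account, status, logon_type="interactive"):
    """Give a short hint for a failed logon (hypothetical helper)."""
    if status == ERROR_LOGON_TYPE_NOT_GRANTED:
        right = RIGHT_FOR_LOGON_TYPE[logon_type]
        return f"{account}: logon type not granted; check that the account holds {right}"
    if status == ERROR_LOGON_FAILURE:
        return f"{account}: unknown user name or bad password"
    return f"{account}: LogonUser failed with Win32 error {status}"


sample = "7/1 11:54:56 LogonUser(condor-reuse-slot1, ... ) failed with status 1385"
for account, status in find_logon_failures(sample):
    print(explain(account, status))
```

Note that a security baseline can also assign the matching deny right (SeDenyInteractiveLogonRight), which takes precedence over the grant, so it is worth asking your IT department to check both.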

Some of the points in here may help:
http://support.microsoft.com/kb/257346

I believe this problem cropped up when we first moved away from using batch login accounts in favour of interactive ones.  We did this primarily because batch login accounts had three major limitations: batch users had trouble running certain types of applications that were becoming popular at the time; newer versions of Windows did not allow batch users to run batch files (via the cmd.exe tool); and batch accounts were found not to work when using Samba as a Primary Domain Controller.  Batch accounts were initially used because it was thought they would allow Condor to process a login faster.  In practice, however, the extra latency incurred by interactive accounts was far outweighed by their benefits.  The only case I can think of where Condor might optionally use batch accounts is when it is running a (very) large number of short-lived jobs.  Even then, the overhead of spawning a starter process will dwarf the login time an interactive login incurs.

Regards,
-B

On 2010-07-01, at 10:58 AM, kschwarz@xxxxxxxxxxxxxx wrote:

> Hi,
> 
> I am losing communication between the SHADOW and STARTER daemons. Their 
> log files are pasted below:
> 
> The ShadowLog shows:
> 
> 7/1 11:54:41 ******************************************************
> 7/1 11:54:41 ** condor_shadow (CONDOR_SHADOW) STARTING UP
> 7/1 11:54:41 ** C:\Condor\bin\condor_shadow.exe
> 7/1 11:54:41 ** SubsystemInfo: name=SHADOW type=SHADOW(6) class=DAEMON(1)
> 7/1 11:54:41 ** Configuration: subsystem:SHADOW local:<NONE> class:DAEMON
> 7/1 11:54:41 ** $CondorVersion: 7.2.4 Jun 15 2009 BuildID: 159529 $
> 7/1 11:54:41 ** $CondorPlatform: INTEL-WINNT50 $
> 7/1 11:54:41 ** PID = 1580
> 7/1 11:54:41 ** Log last touched 6/30 13:30:28
> 7/1 11:54:41 ******************************************************
> 7/1 11:54:41 Using config source: C:\condor\condor_config
> 7/1 11:54:41 Using local config sources: 
> 7/1 11:54:41    C:\condor/condor_config.local
> <snip>
> 7/1 11:54:41 DaemonCore: Command Socket at <10.3.28.8:45848>
> 7/1 11:54:41 Initializing a VANILLA shadow for job 9.0
> 7/1 11:54:42 (9.0) (1580): Request to run on slot1@xxxxxxxxxxxxxxxxxxxx 
> <10.11.3.133:10882> was ACCEPTED
> 7/1 11:54:56 (9.0) (1580): condor_read(): recv() returned -1, errno = 
> 10054, assuming failure reading 5 bytes from <10.11.3.133:10882>.
> 7/1 11:54:56 (9.0) (1580): IO: Failed to read packet header
> 7/1 11:54:56 (9.0) (1580): Can no longer talk to condor_starter 
> <10.11.3.133:10882>
> 7/1 11:54:56 (9.0) (1580): Trying to reconnect to disconnected job
> 7/1 11:54:56 (9.0) (1580): LastJobLeaseRenewal: 1277996096 Thu Jul 01 
> 11:54:56 2010
> 7/1 11:54:56 (9.0) (1580): JobLeaseDuration: 1200 seconds
> 7/1 11:54:56 (9.0) (1580): JobLeaseDuration remaining: 1200
> 7/1 11:54:56 (9.0) (1580): Attempting to locate disconnected starter
> 7/1 11:54:56 (9.0) (1580): locateStarter(): ClaimId 
> (<10.11.3.133:10882>#1277995944#1#9a100f3b140949e336ed2e5322947dd25e5971fa) 
> and GlobalJobId ( PC284419.corp.ad.emb#9.0#1277996003 ) not found
> 7/1 11:54:56 (9.0) (1580): Reconnect FAILED: Job not found at execution 
> machine
> 7/1 11:54:56 (9.0) (1580): **** condor_shadow (condor_SHADOW) pid 1580 
> EXITING WITH STATUS 107
> 
> The StarterLog.slot1 shows:
> 
> 7/1 11:54:55 KEYCACHE: created: 00B87160
> 7/1 11:54:55 ******************************************************
> 7/1 11:54:55 ** condor_starter (CONDOR_STARTER) STARTING UP
> 7/1 11:54:55 ** C:\Condor\bin\condor_starter.exe
> 7/1 11:54:55 ** SubsystemInfo: name=STARTER type=STARTER(8) 
> class=DAEMON(1)
> 7/1 11:54:55 ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON
> 7/1 11:54:55 ** $CondorVersion: 7.2.4 Jun 15 2009 BuildID: 159529 $
> 7/1 11:54:55 ** $CondorPlatform: INTEL-WINNT50 $
> 7/1 11:54:55 ** PID = 5276
> 7/1 11:54:55 ** Log last touched time unavailable (No such file or 
> directory)
> 7/1 11:54:55 ******************************************************
> 7/1 11:54:55 Using config source: C:\Condor\condor_config
> 7/1 11:54:55 Using local config sources: 
> 7/1 11:54:55    C:\condor/condor_config.local
> <snip>
> 7/1 11:54:55 DaemonCore: Command Socket at <10.11.3.133:7721>
> 7/1 11:54:55 GLEXEC_JOB not supported on this platform; ignoring
> 7/1 11:54:55 Setting resource limits not implemented!
> 7/1 11:54:55 Communicating with shadow <10.3.28.8:45848>
> 7/1 11:54:55 Submitting machine is "10-3-28-8.sjk.emb"
> 7/1 11:54:55 setting the orig job name in starter
> 7/1 11:54:55 setting the orig job iwd in starter
> 7/1 11:54:56 LogonUser(condor-reuse-slot1, ... ) failed with status 1385
> 7/1 11:54:56 ERROR "Failed to create a user nobody" at line 442 in file 
> ..\src\condor_c++_util\uids.cpp
> 7/1 11:54:56 ERROR "LocalUserLog::logStarterError() called before init()" 
> at line 222 in file ..\src\condor_starter.V6.1\local_user_log.cpp
> 
> The LogonUser message above shows a "Logon failure: the user has not been 
> granted the requested logon type at this computer."
> 
> Our IT administrators are implementing a new security baseline on 
> the machines, and it seems that condor-reuse-slotn no longer has the 
> rights to run the job. Some rights are not being granted to local 
> accounts.
> Does anyone have any suggestions to fix it?
> 
> Regards, Klaus
> 