
Re: [HTCondor-users] Account condor-reuse-slot1_10 creation failed! (err=2202)



Unfortunately you can't change the "condor-reuse-" prefix for the dynamic usernames.
And you can't control the 1_10 for the slot number.  But you CAN set "slot" to something else.

If you add

STARTD_RESOURCE_PREFIX=s

to your configuration, then your dynamic slots will be called

 s1_10@machine-name

and the dynamic usernames will be

  condor-reuse-s1_10
  12345678901234567890

which fits: 18 characters, under the 20-character limit.
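
A quick way to check the arithmetic is a small Python sketch like the one below. It is not part of HTCondor; the helper names `account_name` and `fits` are made up for illustration, and it assumes the account name is simply "condor-reuse-" followed by the slot name, as the names in this thread suggest.

```python
# Sketch only: checks dynamic account names against the 20-character
# Windows limit mentioned in this thread. Helper names are hypothetical.
WINDOWS_MAX_USERNAME = 20          # limit cited in the original question
ACCOUNT_PREFIX = "condor-reuse-"   # fixed prefix, cannot be changed


def account_name(resource_prefix, slot, dynamic_slot):
    """Build the account name for a dynamic slot such as slot1_10."""
    return f"{ACCOUNT_PREFIX}{resource_prefix}{slot}_{dynamic_slot}"


def fits(name):
    """True if the name is within the Windows account-name limit."""
    return len(name) <= WINDOWS_MAX_USERNAME


# Compare the default prefix with the shorter STARTD_RESOURCE_PREFIX=s
for prefix in ("slot", "s"):
    name = account_name(prefix, 1, 10)
    print(f"{name}: {len(name)} chars, fits={fits(name)}")
```

With the default "slot" prefix the name for slot 1_10 comes out at 21 characters; with "s" it is 18.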

-tj

On 2/3/2014 2:46 AM, Alexey Smirnov wrote:
Hi!

We are running a Windows condor cluster configured with dynamic slots. Recently we added a new 16-core machine to the pool and suddenly ran into problems! Condor is unable to run more than 9 jobs on this new node! Here is what StarterLog.slot1_10 says (the same happens for all slots numbered 10 and above):

StarterLog.slot1_10
===========================
02/01/14 10:05:38 Communicating with shadow <###.###.###.###:61259>
02/01/14 10:05:38 Submitting machine is "###.###.###.###"
02/01/14 10:05:38 setting the orig job name in starter
02/01/14 10:05:38 setting the orig job iwd in starter
02/01/14 10:05:38 Account condor-reuse-slot1_10 creation failed! (err=2202)
02/01/14 10:05:38 update_psid() failed after account creation!
02/01/14 10:05:38 ERROR "Failed to create a user nobody" at line 610 in file c:\condor\execute\dir_29540\userdir\src\condor_utils\uids.cpp
02/01/14 10:05:38 ShutdownFast all jobs.
02/01/14 10:05:38 condor_read() failed: recv(fd=1460) returned -1, errno = 10054 , reading 5 bytes from <147.125.99.159:61298>.
02/01/14 10:05:38 IO: Failed to read packet header
02/01/14 10:05:38 Error disabling account condor-reuse-slot1_10 (INVALID PARAMETER)


The source of the problem is more or less clear. We are not using "run_as_owner" mode, so condor creates a temporary account on the execute node. The account name follows the template "condor-reuse-slot<X>". Windows limits account names to 20 characters, so the 21-character name "condor-reuse-slot1_10" cannot be created. This seems to be a bug in condor!
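
To illustrate why the node stops at 9 jobs, here is a minimal Python sketch (not HTCondor code; the function name `too_long_accounts` is invented for this example). It assumes the dynamic slots under slot1 are numbered 1 through 16 on a 16-core machine and that the account name is "condor-reuse-" plus the slot name.

```python
# Sketch only: lists which dynamic-slot account names exceed the
# 20-character Windows limit. Slot numbering 1..n is an assumption.
MAX_LEN = 20


def too_long_accounts(n_dynamic, prefix="slot", parent=1):
    """Return the account names that are longer than MAX_LEN."""
    names = [f"condor-reuse-{prefix}{parent}_{i}"
             for i in range(1, n_dynamic + 1)]
    return [name for name in names if len(name) > MAX_LEN]


bad = too_long_accounts(16)
print(f"{len(bad)} of 16 account names are too long:")
for name in bad:
    print(f"  {name} ({len(name)} chars)")
```

Under these assumptions, slot1_1 through slot1_9 give names of exactly 20 characters, and every name from slot1_10 on is 21 characters, which matches the 9-job ceiling described above.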

(A similar question was already asked on the condor mailing list - https://www-auth.cs.wisc.edu/lists/htcondor-users/2012-July/msg00064.shtml... Unfortunately it went unanswered...)

Any ideas how to proceed?

Thanks,
Alexey


