Re: [Condor-users] Jobs rejected by machines

To diagnose why a machine's requirements will not match a job, I recommend getting a full dump of the machine ClassAd:

condor_status -long <machine name>

Then look at the Start expression and at all the expressions that it refers to.
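
If the dump is long, a small script can do the tracing for you. The following is only a rough sketch in Python, not an official Condor tool: it assumes the dump has been saved as text in the usual "Attribute = value" form, and it treats every identifier outside a string literal as a possible attribute reference. The function names and the SAMPLE ad (host name, numbers, thresholds) are made up for illustration and are not output from a real pool.

```python
import re


def parse_classad_dump(text):
    """Turn 'Attribute = value' lines into a dict of raw expression strings."""
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep and name.strip():
            ad[name.strip()] = value.strip()
    return ad


def referenced_attributes(ad, root="Start"):
    """List the attributes reachable from ad[root], in order of discovery."""
    # Attribute names are matched case-insensitively.
    lookup = {name.lower(): name for name in ad}
    seen = []
    pending = [root]
    while pending:
        name = lookup.get(pending.pop(0).lower())
        if name is None or name in seen:
            continue  # not an attribute of this ad, or already visited
        seen.append(name)
        # Drop string literals so "Unclaimed" is not read as an attribute.
        expr = re.sub(r'"[^"]*"', "", ad[name])
        pending.extend(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", expr))
    return seen


# Hypothetical, hand-written ad for illustration only.
SAMPLE = """\
Name = "slot1@example-host"
State = "Unclaimed"
Activity = "Idle"
KeyboardIdle = 42
LoadAvg = 0.05
CondorLoadAvg = 0.0
Start = ( ( KeyboardIdle > 15 * 60 ) && ( ( ( LoadAvg - CondorLoadAvg ) <= 0.3 ) || ( State != "Unclaimed" && State != "Owner" ) ) )
"""

if __name__ == "__main__":
    ad = parse_classad_dump(SAMPLE)
    for attr in referenced_attributes(ad):
        print(attr, "=", ad[attr])
```

To use it on a real machine, redirect the output of the condor_status command above to a file and pass the file's contents to parse_classad_dump() in place of SAMPLE. The printed values show which term of the Start expression is holding the slot back.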


On 8/22/10 7:39 PM, Jolly, Ben wrote:

I am trying to run a bunch of jobs in the 'vanilla' universe with a 'stock' Condor setup on around 15-20 dual core machines (so 30-40 slots depending how many are connected).  The problem is that all the jobs sit 'Idle', and when I try a status check the vast majority of slots are sitting 'Unclaimed' and 'Idle'.  There are a few with 'Owner' status but no others are 'Claimed' or 'Busy'.  My jobs are the only ones in the queue.  We have looked through the config file on each of the client machines and had a quick play with the 'START = ' line, changing the value to 'true' instead of '$(UWCS_START)'.  This worked brilliantly except that Condor then ran all my jobs all the time, regardless of whether or not a user was logged on to the machine (about half of the machines in the pool are used by people during the day, the other half are dedicated).  When we changed the START variable back work ceased on all the machines and they are now all 'Unclaimed'.  A 'condor_q -analyze' command gives the result '34 reject your job because of their own requirements' (there are only 34 slots available).

Does anyone know what could be causing this?  I guess I should mention that we are running Condor version 7.5.2 (built Apr 19 2010 - 232940) under Windows (Platform: INTEL-WINNT50).  Our UWCS_START is:

# Only start jobs if:
# 1) the keyboard has been idle long enough, AND
# 2) the load average is low enough OR the machine is currently
#    running a Condor job
# (NOTE: Condor will only run 1 job at a time on a given resource.
# The reasons Condor might consider running a different job while
# already running one are machine Rank (defined above), and user
# priorities.)
UWCS_START	= ( (KeyboardIdle > $(StartIdleTime)) \
                     && ( $(CPUIdle) || \
                          (State != "Unclaimed" && State != "Owner")) )


StartIdleTime		= 15 * $(MINUTE)
CPUIdle			= ($(NonCondorLoadAvg) <= $(BackgroundLoad))
