Re: [Condor-users] Why does machine reject job for unknown reasons


On 5/15/07, Tony Rippy <trippy@xxxxxxxxxxxxxxxxxx> wrote:
>> condor_q -better-analyze 1082109.0
> 1082109.000:  Run analysis summary.  Of 152 machines,
>       2 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>     150 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job

Hi Alex,

Based on the better-analyze results above, it looks like the startds are
rejecting the job. I would start by checking the startd policy at your
site.  There is more information about this here:


One culprit might be a bad Start expression. You can check the Start
expression of your execute nodes by running the following command:

condor_status -startd -format "%s" Machine -format ": Start = %s\n" Start

I got nodeXX.YYY : Start=True

for almost all of the nodes. Does this look ok?
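
Since almost all of the nodes report Start = True, the ones worth a closer
look are the exceptions. Below is a minimal sketch in Python, assuming the
output lines have the "machine: Start = value" shape produced by the -format
strings in the command above. The function name and the script name
check_start.py are hypothetical, chosen only for illustration. It lists the
machines whose Start value is not literally True:

```python
import re
import sys

# Matches lines such as "node01.example.com: Start = True".
# Spacing around ":" and "=" is tolerated, since it may vary.
LINE_RE = re.compile(r"^\s*(\S+?)\s*:\s*Start\s*=\s*(.*?)\s*$")


def nodes_with_nontrivial_start(output):
    """Return (machine, start_value) pairs whose Start is not literally True."""
    flagged = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        machine, value = match.groups()
        if value.lower() != "true":
            flagged.append((machine, value))
    return flagged


if __name__ == "__main__":
    # Reads the condor_status output from standard input.
    for machine, value in nodes_with_nontrivial_start(sys.stdin.read()):
        print(f"{machine}: Start = {value}")
```

It could be used by piping the condor_status command above into
"python check_start.py". An empty result would only mean that no node
advertises a Start value other than True; it would not by itself explain
the "unknown reasons" rejections.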

If the Start expression seems ok, then try looking through the negotiator
log on your central manager. It contains more information about why a
match isn't being made, but you may have to turn on additional logging.
There is more information about logging levels in section 3.3.4 of the
Condor manual:


Good luck!

Tony Rippy
Cycle Computing, LLC

