Subject: Re: [HTCondor-users] Not running Parallel-universe jobs?
The -analyze and -better-analyze options still show machines which don't have the DedicatedScheduler attribute set as "available" to run a parallel universe job:
008.000:  Run analysis summary.  Of 6 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      6 are available to run your job
This is what shows up when I submit a parallel job with machine_count = 3 to a static-slot pool which has only two slots with DedicatedScheduler set. If you add a job requirement of ( !isUndefined(DedicatedScheduler) ), or some more sophisticated expression to match the dedicated scheduler to which the job was submitted, then the analysis will show you a clearer picture:
009.000:  Run analysis summary.  Of 6 machines,
      4 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      2 are available to run your job
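For reference, a submit description along these lines would add that requirement (this is just a sketch; the executable name and machine count are placeholders for whatever your job actually uses):

universe       = parallel
executable     = mpi_job.sh
machine_count  = 3
requirements   = !isUndefined(DedicatedScheduler)
queue

Note that condor_submit will AND this with the requirements it generates automatically, so the -analyze output then counts only the dedicated slots as matches.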
(Feature request for 8.2.10?)
Check section 2.9.2 in the 8.2.9 manual for more details about the DedicatedScheduler attribute. A parallel job will only run on a slot with the DedicatedScheduler attribute set; if you're expecting the job to run on all six available machines, maybe some of them lost that setting in the wake of your recent disruption.
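To see which slots in the pool actually advertise the attribute right now, a condor_status query along these lines should do it (a sketch; adjust the printed attributes to taste):

condor_status -constraint '!isUndefined(DedicatedScheduler)' -af Machine DedicatedScheduler

Any machine missing from that output will never match a parallel-universe job, regardless of what -analyze reports.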
As to the current job you're waiting on: once those 41 slots which are currently running jobs open up, your job will be dispatched.
Michael V. Pelletier
IT Program Execution
Principal Engineer
978.858.9681 (5-9681) NOTE NEW NUMBER
339.293.9149 cell
339.645.8614 fax
michael.v.pelletier@xxxxxxxxxxxx