Re: [Condor-users] Negotiator problem? Jobs not assigned to idle machines.

At 12:09 PM 8/1/2006, Rick Lan wrote:

Setting NEGOTIATOR_CONSIDER_PREEMPTION = True seems to work. However, at
first jobs would begin to run, then some of the jobs would get stuck as
"match but reject the job for unknown reasons" for about 15 minutes and then
start running. Now they have been stuck for 2 hours. I've attached SchedLog and
NegotiatorLog below.

8/1 22:06:02       Rejected 93.0 malikr@xxxxxxxx <>: no match found

The line above is strange, since the previous jobs had identical submit
files apart from the file paths.

Obvious question, but do you have (or did you have?) "Unclaimed" machines in your pool according to condor_status?

Try running "condor_status -state" and see how long these Unclaimed machines have been Unclaimed (by looking at the StateTime column). Perhaps these machines are being claimed and start running jobs, but then immediately toss the job off? In that case, whenever you look, you would typically see the machine Unclaimed and the job idle. This could happen if, for example, the stdin file specified in the submit file does not exist, or something along those lines.
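To rule out the missing-stdin case quickly, something like the rough Python sketch below could be run against the submit file on the submit machine. It only looks at plain "input = <path>" lines, which is how a submit file names the job's stdin. The function name missing_input_files is made up for illustration, and the sketch assumes relative paths resolve against a directory you pass in. It does not expand macros such as $(Process) or honor initialdir, so treat a clean result as a hint rather than proof.

```python
import os
import re


def missing_input_files(submit_text, base_dir="."):
    """Return the paths named by 'input = ...' lines that do not exist.

    Assumption: relative paths are resolved against base_dir. Macros such
    as $(Process) are NOT expanded and initialdir is NOT honored.
    """
    missing = []
    for line in submit_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and submit-file comments.
        if not stripped or stripped.startswith("#"):
            continue
        # Match only the 'input' command (case-insensitive), not
        # longer names such as transfer_input_files.
        m = re.match(r"(?i)input\s*=\s*(.+)$", stripped)
        if m:
            path = m.group(1).strip()
            full = path if os.path.isabs(path) else os.path.join(base_dir, path)
            if not os.path.exists(full):
                missing.append(path)
    return missing
```

For example, missing_input_files(open("job.submit").read(), "/path/to/submit/dir") returns a list of stdin paths that could not be found; both the file name and the directory here are placeholders. An empty list means every literal input path was found.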


Todd Tannenbaum                       University of Wisconsin-Madison
Condor Project Research               Department of Computer Sciences
tannenba@xxxxxxxxxxx                  1210 W. Dayton St. Rm #4257
http://www.cs.wisc.edu/~tannenba      Madison, WI 53706-1685
Phone: (608) 263-7132  FAX: (608) 262-9777