
Re: [Condor-users] rooster on linux, take 3




After testing to see which machines match the job, the negotiator sorts the matching machines and chooses the most desirable one. If it chooses an offline machine, it should inform the collector and update MachineLastMatchTime. Can you confirm from your negotiator log whether it is choosing the offline machine or not? From the log you posted, I can only see that the offline machine was selected as a candidate, not whether it was actually chosen.

The default sort order is:

1. Prefer slots that are idle (NEGOTIATOR_PRE_JOB_RANK).
2. In case of a tie, prefer slots based on the job's rank expression.
3. In case of a tie, prefer slots that are not offline (NEGOTIATOR_POST_JOB_RANK).
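
To make the tie-breaking concrete, here is a rough sketch of that three-level ordering in Python. It is only an illustration: the real negotiator evaluates ClassAd expressions, and the Slot class, its fields, and the slot names below are hypothetical stand-ins, not anything taken from Condor itself.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    # Hypothetical model of a matching slot; not a real Condor data structure.
    name: str
    idle: bool        # stand-in for the pre-job rank (idle slots preferred)
    job_rank: float   # value of the job's rank expression for this slot
    offline: bool     # stand-in for the post-job rank (online slots preferred)


def sort_key(slot):
    # Tuples compare element by element, so each later entry
    # only matters when all earlier entries are tied.
    return (slot.idle, slot.job_rank, not slot.offline)


def choose(slots):
    # The most desirable slot is the one with the largest key.
    return max(slots, key=sort_key)


candidates = [
    Slot("slot1@busy", idle=False, job_rank=5.0, offline=False),
    Slot("slot1@robin", idle=True, job_rank=1.0, offline=True),
    Slot("slot1@awake", idle=True, job_rank=1.0, offline=False),
]

print(choose(candidates).name)
```

In this toy model an offline slot only wins when no online slot ties with it on the first two criteria, which is why a machine can show up as a matching candidate without ever being the one that is chosen.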

--Dan

On 11/28/11 6:04 PM, Dimitri Maziuk wrote:
On 11/28/2011 12:44 PM, Dan Bradley wrote:

"Registering attempt to match offline machine <host.name> by <user.name>."
Not exactly:

11/28/11 17:40:19 Testing whether the job matches with the following
machine ad:
Machine = "robin.bmrb.wisc.edu"
...
Offline = true
...
11/28/11 17:40:19 Job 963337.0 does match with slot1@xxxxxxxxxxxxxxxxxxxx
-----------------------------------------------

However, there's no MachineLastMatchTime:

$ condor_status -l robin | grep -i lastmatch
Unhibernate = MY.MachineLastMatchTime =!= undefined
(This line appears 4 times, presumably once for each core.)
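
The grep hit above is the Unhibernate expression, which merely mentions the attribute name. A small sketch of a stricter check follows, assuming the long-form output consists of "Name = value" lines; the helper names parse_ad and has_attribute are made up for this example, and the sample text reuses only lines quoted in this message.

```python
def parse_ad(text):
    # Split each "Name = value" line at the first " = " only, so values
    # that themselves contain operators such as "=!=" stay intact.
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep:
            ad[name.strip()] = value.strip()
    return ad


def has_attribute(ad, name):
    # ClassAd attribute names are case-insensitive, so compare lowercased.
    return name.lower() in (key.lower() for key in ad)


sample = """Machine = "robin.bmrb.wisc.edu"
Offline = true
Unhibernate = MY.MachineLastMatchTime =!= undefined"""

ad = parse_ad(sample)
print(has_attribute(ad, "MachineLastMatchTime"))
print(ad["Unhibernate"])
```

Checking attribute names rather than whole lines separates "the ad defines MachineLastMatchTime" from "some expression in the ad refers to it".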

Also, the machine did wake up when I submitted my 10K+ jobs. It ran for
5 minutes or so, then it went back to sleep and isn't waking up again.
At least that part is consistently reproducible...

Thanks