
[Condor-users] condor jobs being preempted by local activity on machines?




We're running a fairly small condor cluster (16 dual-core CPUs, so 32 nodes) on machines that are also general-purpose compute servers.
I think we're running into problems where condor is marking a CPU as busy when users are running other (non-condor) processes on the machines.

I think the cause is that some users are bypassing condor and running their jobs directly on the machines. This results in CpuIsBusy = TRUE (e.g., in the output of condor_status -l s11) and prevents these machines from getting matched to jobs. Meanwhile, condor_q misleadingly reports these machines as idle.
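
To make the symptom concrete, here is a rough sketch of how the busy flag could be checked from the long-form machine ad. It assumes the flag is driven by the load not caused by condor (LoadAvg minus CondorLoadAvg) compared against a threshold. The 0.5 threshold is an assumption modeled on typical example configurations and may differ on a given pool, the helper names are hypothetical, and the sample ad values are made up for illustration rather than taken from a real machine.

```python
def parse_classad(text):
    """Parse 'Name = value' lines (as printed by condor_status -l) into a dict of strings."""
    ad = {}
    for line in text.splitlines():
        # Split on the first '=' only, so expressions containing '==' stay intact.
        name, sep, value = line.partition("=")
        if sep:
            ad[name.strip()] = value.strip().strip('"')
    return ad


def non_condor_load(ad):
    """Load average attributable to processes that condor did not start."""
    return float(ad["LoadAvg"]) - float(ad["CondorLoadAvg"])


def looks_busy(ad, high_load=0.5):
    """Assumed rule: the CPU counts as busy once non-condor load reaches high_load."""
    return non_condor_load(ad) >= high_load


# Illustrative ad text only; these values are invented for the example.
SAMPLE = """\
Name = "s11"
LoadAvg = 1.020000
CondorLoadAvg = 0.000000
CpuIsBusy = TRUE
"""

if __name__ == "__main__":
    ad = parse_classad(SAMPLE)
    print(ad["Name"], round(non_condor_load(ad), 2), looks_busy(ad))
```

In practice the text would come from saving the output of condor_status -l for one machine and passing it to parse_classad. Under the assumed rule, a single user job run outside condor on an otherwise idle CPU is enough to cross the threshold.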

Is there a way around this? I've read the preemption and scheduling sections of the manual, and they all appear to deal with how to handle scheduling WITHIN condor. Is there a way to make condor's threshold for flagging a CPU as busy significantly higher? Alternatively, is there a way to raise the priority at which ALL condor jobs run?


Thanks,
-- Chris Mason