
Re: [Condor-users] How to solve/debug no matching?



Carsten Aulbert wrote:
Hi Matt

Matthew Farrellee wrote:

ENABLE_BACKFILL=TRUE
BACKFILL_SYSTEM=BOINC
START_BACKFILL=$(StateTimer)>(1*$(MINUTE))
EVICT_BACKFILL=FALSE
Here's a shot: EVICT_BACKFILL = FALSE would suggest that once the machine enters the backfill state, it will always stay in the backfill state.


Well, since these are dedicated compute nodes, I thought Condor would kill
the backfill jobs anyway when work is available? I think it worked in the
past with this setting.

According to this email
https://lists.cs.wisc.edu/archive/condor-users/2006-April/msg00092.shtml
this setting is right, but what shall I put there?

According to that email, if you have LoadAvg (or something that references it) in your START, chances are the backfill job will trick your policy into not running any other jobs -- load will be high thanks to BOINC.

It seems like BOINC-generated load should be considered part of CondorLoadAvg. It could be argued that it is just Condor utilizing the CPU of the node, not a job or a user at the console.


If I set EVICT_BACKFILL=TRUE, the backfill will never start, and
$(MachineBusy) does not really make much sense either, does it?

MachineBusy is defined in your global condor_config file: MachineBusy = ($(CPUBusy) || $(KeyboardBusy))

Also,

NonCondorLoadAvg  = (LoadAvg - CondorLoadAvg)
CPUBusy        = ($(NonCondorLoadAvg) >= $(HighLoad))
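
As a rough sketch of how these macros would evaluate on a node while BOINC is running: the load figures below are invented for illustration, HighLoad = 0.5 is assumed to match the stock condor_config, and the sketch assumes (as discussed above) that BOINC's load is not counted in CondorLoadAvg.

```
# Illustrative values only, for a single-core node busy with BOINC:
#   LoadAvg       = 1.0   (BOINC keeps the CPU fully loaded)
#   CondorLoadAvg = 0.0   (assumption: BOINC load is not attributed to Condor)
HighLoad         = 0.5
NonCondorLoadAvg = (LoadAvg - CondorLoadAvg)
CPUBusy          = ($(NonCondorLoadAvg) >= $(HighLoad))
MachineBusy      = ($(CPUBusy) || $(KeyboardBusy))

# Evaluation under the assumed values:
#   NonCondorLoadAvg -> 1.0 - 0.0 = 1.0
#   CPUBusy          -> 1.0 >= 0.5, which is TRUE
#   MachineBusy      -> TRUE, whatever KeyboardBusy is
```

If that assumption holds, EVICT_BACKFILL = $(MachineBusy) would become TRUE as soon as BOINC pushes the load up, and any START expression built on NonCondorLoadAvg would see the node as busy.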


You could try using Job Hooks instead of backfill to keep your machines busy when they aren't running any normal jobs.
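
A minimal sketch of what a fetch-work hook setup might look like in the startd configuration. The keyword BOINC_FILL and the script path are hypothetical placeholders, the hook script itself would have to be written locally, and the 300-second delay is only an example value.

```
# Hypothetical keyword that selects which set of hooks this startd uses
STARTD_JOB_HOOK_KEYWORD = BOINC_FILL

# Hypothetical path to a site-written script; the startd invokes it to ask
# for work, and the script prints a job ClassAd when it has something to run
BOINC_FILL_HOOK_FETCH_WORK = /usr/local/condor/hooks/fetch_boinc_work

# Wait this many seconds between fetch attempts (example value)
FetchWorkDelay = 300
```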

Best,


matt