[Condor-users] Ways to limit blackhole machines?

A while ago on this list, someone mentioned that there
was a way to detect when one machine is causing a lot of jobs
to be held, and to keep it from running any more.  One example
is a machine on which every job that runs gets held
immediately with "error on starter vmx@nodename".

I believe it involves a change to the START expression of the machine.
Does anyone have this syntax handy?


Steve Timm

Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.