
Re: [Condor-users] condor_rm on lots of jobs

Ian Stokes-Rees wrote:
FWIW, the jobs are generally managed through DAGMan, and have POST
scripts associated with them.  I would have thought that a queued job
that gets "rm'ed" isn't going to go through the POST script, but I could
be wrong.

This is a case where you *don't* want to manually remove the node
jobs.  If you remove the node jobs, and DAGMan starts noticing those
events before it gets removed itself, it *will* run POST scripts for
the removed jobs, which might be part of your load problem.

We just removed the DAGMan job, but we still got a load of 15000 (yes, 3
zeros).  We're running 7.4.

How many instances of DAGMan were in the queue?  O(10)?  O(1000)? ...

--Dan