Re: [HTCondor-users] defrag on condor 7.4?

On 5/11/2013 1:10 PM, Russ Poyner wrote:
We have a pool running 7.4.4 with dynamic/partitionable slots. A user is
wanting to run jobs with request_cpus = 8 and I'm concerned that our
machines will be too fragmented to accept the jobs.

Clearly the correct solution is to upgrade to a current condor version
with condor_defrag. That's scheduled for June, after the looming
research deadline. Meanwhile, I wonder if there is a way to simulate
defrag-type behavior, by hand if needed, on a 7.4.4 pool.

Perhaps something as simple as periodically running:
   condor_restart -peaceful [hostname]
on some number of execute machines? IIRC, even in v7.4.x, telling an execute node to restart with the -peaceful option makes that node stop accepting new jobs while letting currently running jobs finish. Only once all jobs have completed does the startd exit, at which point the master restarts everything. The node should then come back with its partitionable slot whole again, so a request_cpus = 8 job has a chance to match before smaller jobs carve the slot up.
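
If you want to script that rather than type it by hand, here is a rough sketch of the idea, not something tested against a 7.4.4 pool. It reads host names from a plain text file and issues the same condor_restart -peaceful command for the first few of them. The file of host names, the batch size, the DRY_RUN switch, and the function names pick_hosts and drain_batch are all made up for this example; only the condor_restart -peaceful [hostname] form comes from the suggestion above.

```shell
#!/bin/sh
# Sketch: peacefully restart a small batch of execute nodes so they drain.
# The host file, batch size, and DRY_RUN are hypothetical names for this
# example and are not HTCondor settings.

# Print the first $2 host names from file $1, skipping blank and '#' lines.
pick_hosts() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | head -n "$2"
}

# Issue (or, with DRY_RUN=1, only print) a peaceful restart for each host.
drain_batch() {
    for host in $(pick_hosts "$1" "$2"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "condor_restart -peaceful $host"
        else
            condor_restart -peaceful "$host"
        fi
    done
}

# Example (prints the commands without running them):
#   DRY_RUN=1 drain_batch execute_hosts.txt 2
```

Run it with DRY_RUN=1 first to see which machines it would touch. This sketch always takes the hosts from the top of the file, so you would want to reorder or rotate the file between runs, and keep the batch small enough that the pool does not lose too much capacity while nodes drain.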