
Re: [Condor-users] Effect of condor_off -peaceful on schedd


Currently, for all daemons other than the startd, condor_off -peaceful is equivalent to condor_off -graceful. For the schedd, this means it will force running jobs to vacate, which is not what you want.

We are planning to improve support for -peaceful shutdown/restart of the schedd. Rather than just waiting for all jobs to finish, we hope to take advantage of restartable starter-shadow connections when job_lease_duration is being used.

For now, if you want to restart the schedd quickly and your jobs are using job_lease_duration, you have to kill -9 the schedd and its shadows and then restart the schedd. The new schedd will start shadows that reconnect to the jobs that were already running. Note, however, that restartable starter-shadow connections do not currently work for jobs running in a remote pool via Condor flocking; that is another item on the TODO list.
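
For reference, a job opts into the lease through its submit description file. The fragment below is only a minimal sketch: the universe, the executable name my_job, and the 1200-second value are illustrative assumptions, not recommendations. As I understand it, the value is the number of seconds the job may keep running on the execute machine without contact from the submit side, so it should comfortably exceed the time you expect the schedd to be down.

```
# Submit description file (illustrative values only)
universe           = vanilla
# "my_job" is a placeholder executable name
executable         = my_job
# Lease length in seconds; 1200 is an assumed example value
job_lease_duration = 1200
queue
```

Jobs submitted without a lease have nothing to reconnect to, so they would not survive the kill-and-restart approach described above.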


On Apr 17, 2006, at 10:58 AM, Steven Timm wrote:

We have, in the past, used condor_off -peaceful to shut down a number
of worker nodes in our cluster for maintenance and it has done what
we expected it to do, namely, keep the startd from starting any more
jobs and finish the one that is already running.

My question is: what if we ran

condor_off -all -peaceful

on the head node that is, in our configuration, running schedd,
collector, and negotiator?  What would be the result?

It would be nice to get the schedd in a state such that it
would let any currently-running jobs finish, and record that
they had finished, but not let any new ones start.  Would
that be the effect of condor_off -peaceful on a schedd, or would
the effects be totally unpredictable?

Steve Timm

Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team