
Re: [HTCondor-users] 'condor_off -peaceful' kills jobs



Hi Steve,

Thanks for the info. However, I just ran the test you suggested, and it still doesn't work, although the command claims it sent the peaceful shutdown commands:

# condor_off -peaceful
Sent "Set-Peaceful-Shutdown" command to local startd
Sent "Kill-All-Daemons-Peacefully" command to local master

but in the log:

..........
02/24/16 09:49:45 slot2_1: State change: claim-activation protocol successful
02/24/16 09:49:45 slot2_1: Changing activity: Idle -> Busy
02/24/16 09:50:09 Got SIGTERM. Performing graceful shutdown.
02/24/16 09:50:09 shutdown graceful
02/24/16 09:50:09 Cron: Killing all jobs
02/24/16 09:50:09 Cron: Killing all jobs
02/24/16 09:50:09 Killing job mips
02/24/16 09:50:09 Killing job kflops
02/24/16 09:50:09 slot2_1: Changing activity: Busy -> Retiring
02/24/16 09:50:09 slot2_1: State change: claim retirement ended/expired
.............

and the job got killed.
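
To check which shutdown mode the daemon actually logged, the excerpt above can be scanned with a short script. This is only a minimal sketch: the helper name `find_shutdown_events` is hypothetical, the line format is assumed from the excerpt alone, and I have not confirmed what a peaceful shutdown looks like in the log, so the script simply reports whatever word follows "shutdown".

```python
import re

# Assumed line layout, taken from the excerpt: "MM/DD/YY HH:MM:SS message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def find_shutdown_events(log_text):
    """Return (timestamp, kind) pairs for each 'shutdown <kind>' log line."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip separators such as "........" and blank lines
        timestamp, message = match.groups()
        if message.startswith("shutdown "):
            # Keep whatever follows "shutdown", e.g. "graceful"
            events.append((timestamp, message.split(None, 1)[1]))
    return events


# Lines copied from the log excerpt in this message
SAMPLE = """\
02/24/16 09:49:45 slot2_1: Changing activity: Idle -> Busy
02/24/16 09:50:09 Got SIGTERM. Performing graceful shutdown.
02/24/16 09:50:09 shutdown graceful
02/24/16 09:50:09 slot2_1: Changing activity: Busy -> Retiring
"""

print(find_shutdown_events(SAMPLE))
```

On the excerpt this reports a single graceful shutdown at 09:50:09, which matches what I see: the peaceful request was sent, but the daemon logged a graceful shutdown.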

Di


On 24/02/16 06:44 AM, Steven Timm wrote:
My experience is that it will work if you leave off the
-daemon startd

(at least in 8.0.x and 8.2.x).
Plus you have to be on the host itself. If you run the command from the
central manager, i.e.

condor_off -peaceful <hostname>

it doesn't work either.

We were recently reminded that there is also the condor_drain command
nowadays, which has slightly different functionality but may
do what you want.

Steve Timm


On Tue, 23 Feb 2016, Di Qing wrote:

Hi All,

The condor_off manual says that the '-peaceful' option waits
indefinitely for jobs to finish. However, when I tested the
'condor_off -peaceful' command, it killed jobs immediately. The
command I used is as follows:

# condor_off -peaceful -daemon startd
Sent "Set-Peaceful-Shutdown" command to local startd
Sent "Kill-Daemon-Peacefully" command to local master

It claimed that it sent the peaceful shutdown command, but in fact it
killed jobs immediately, and the startd log recorded 'shutdown graceful':
................
02/23/16 15:44:21 slot2_1: State change: claim-activation protocol successful
02/23/16 15:44:21 slot2_1: Changing activity: Idle -> Busy
02/23/16 15:45:57 Got SIGTERM. Performing graceful shutdown.
02/23/16 15:45:57 shutdown graceful
02/23/16 15:45:57 Cron: Killing all jobs
02/23/16 15:45:57 Cron: Killing all jobs
02/23/16 15:45:57 Killing job mips
02/23/16 15:45:57 Killing job kflops
02/23/16 15:45:57 slot2_1: Changing activity: Busy -> Retiring
02/23/16 15:45:57 slot2_1: State change: claim retirement ended/expired
.................

The condor version we are using is 8.4.2.

Thanks,

Di
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Office: Feynman Computing Center 243
Fermilab Scientific Computing Division,
Scientific Computing Facilities Quadrant.,
Experimental Computing Facilities Dept.,

