Re: [Condor-users] Upgrade Debian package without restarting Condor?
- Date: Mon, 12 Jan 2009 10:30:11 -0600
- From: Matthew Farrellee <matt@xxxxxxxxxx>
- Subject: Re: [Condor-users] Upgrade Debian package without restarting Condor?
Steffen Grunewald wrote:
On Mon, Jan 12, 2009 at 12:52:14PM +0100, Steffen Grunewald wrote:
I'm looking for someone who knows how to upgrade Condor on a Debian
(or Ubuntu) system without restarting Condor. A restart could kill
running jobs, which is counterproductive, and condor_master is supposed
to detect when binaries have been updated.
Hmmm. After removing "invoke-rc.d condor restart --configure" from the
postinst script that had been used before, I tested an upgrade on a
single node and got the following:
1/12 17:12:09 /usr/sbin/condor_master was modified, restarting /usr/sbin/condor_master.
1/12 17:12:09 Sent SIGTERM to STARTD (pid 3632)
1/12 17:12:09 Sent SIGTERM to CKPT_SERVER (pid 3633)
1/12 17:12:09 DaemonCore: pid 3632 exited with status 0, invoking reaper 1 <Daemons::AllReaper()>
1/12 17:12:09 The STARTD (pid 3632) exited with status 0
1/12 17:12:09 DaemonCore: return from reaper for pid 3632
1/12 17:12:09 DaemonCore: pid 3633 exited with status 0, invoking reaper 1 <Daemons::AllReaper()>
1/12 17:12:09 The CKPT_SERVER (pid 3633) exited with status 0
1/12 17:12:09 All daemons are gone. Restarting.
1/12 17:12:09 Restarting master in 120 seconds.
1/12 17:12:09 DaemonCore: return from reaper for pid 3633
1/12 17:14:09 Doing exec( "/usr/sbin/condor_master" )
Looking at the process table, condor_master is still the old one, only the
daemons are new.
Since the node on which I ran the test was idle I didn't find out what would
happen with actually running user jobs, and now I'm a bit worried whether
there's a difference at all...
Would someone please shed some light on this?
A SIGTERM to the Startd is a peaceful shutdown, i.e. it should let jobs
complete before exiting.
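
As a generic illustration of that idea (not Condor's actual code), here is a
minimal Python sketch of a process that, on SIGTERM, finishes the job it is
currently running and only then stops taking new work. The Worker class and
its method names are hypothetical, made up for this example.

```python
import os
import signal


class Worker:
    """Hypothetical worker that drains its current job on SIGTERM."""

    def __init__(self):
        self.stop_requested = False
        self.completed = []
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Only record the request; the running job is not interrupted.
        self.stop_requested = True

    def run(self, jobs):
        for job in jobs:
            if self.stop_requested:
                # Shutdown was requested: do not start another job.
                break
            self.completed.append(job())
        return self.completed
```

A job that is already running when the signal arrives is allowed to return
normally; the flag is only checked between jobs.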
The master is re-exec'ing itself. So it'll be the same PID, but a