
Re: [Condor-users] Upgrade Debian package without restarting Condor?



Steffen Grunewald wrote:
On Mon, Jan 12, 2009 at 12:52:14PM +0100, Steffen Grunewald wrote:
I'm looking for someone who might know how to upgrade Condor on a Debian
(or Ubuntu) system without restarting Condor (since that would have the
potential to kill running jobs - which is counterproductive, and condor_master
is supposed to tell when binaries have been updated).
Suggestions welcome!

Hmmm. After removing "invoke-rc.d condor restart --configure" from the
postinst script that had been used before, I tested an upgrade on a
single node and got the following:

1/12 17:12:09 /usr/sbin/condor_master was modified, restarting /usr/sbin/condor_master.
1/12 17:12:09 Sent SIGTERM to STARTD (pid 3632)
1/12 17:12:09 Sent SIGTERM to CKPT_SERVER (pid 3633)
1/12 17:12:09 DaemonCore: pid 3632 exited with status 0, invoking reaper 1 <Daemons::AllReaper()>
1/12 17:12:09 The STARTD (pid 3632) exited with status 0
1/12 17:12:09 DaemonCore: return from reaper for pid 3632
1/12 17:12:09 DaemonCore: pid 3633 exited with status 0, invoking reaper 1 <Daemons::AllReaper()>
1/12 17:12:09 The CKPT_SERVER (pid 3633) exited with status 0
1/12 17:12:09 All daemons are gone.  Restarting.
1/12 17:12:09 Restarting master in 120 seconds.
1/12 17:12:09 DaemonCore: return from reaper for pid 3633
1/12 17:14:09 Doing exec( "/usr/sbin/condor_master" )

Looking at the process table, condor_master is still the old one, only the
daemons are new.
Since the node on which I ran the test was idle I didn't find out what would
happen with actually running user jobs, and now I'm a bit worried whether
there's a difference at all...

Would someone please shed some light on this?

Steffen

A SIGTERM to the Startd is a peaceful shutdown, i.e. it should let jobs complete before exiting.
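
As a generic illustration of that pattern (not Condor's actual code), here is a minimal Python sketch of a process that treats SIGTERM as a request to finish its current work and then exit with status 0, like the "exited with status 0" lines in the log above. The worker script, the names WORKER and terminate_gracefully, and the "ready" handshake are all made up for this example; it assumes a POSIX system.

```python
import subprocess
import sys

# Hypothetical worker: on SIGTERM it only sets a flag, finishes its loop, and exits 0.
WORKER = r'''
import signal, sys, time

stop_requested = False

def on_sigterm(signum, frame):
    # Do not exit here; just remember that a shutdown was requested.
    global stop_requested
    stop_requested = True

signal.signal(signal.SIGTERM, on_sigterm)
print("ready", flush=True)          # tell the parent the handler is installed

while not stop_requested:
    time.sleep(0.05)                # stand-in for real work

print("finished current work", flush=True)
sys.exit(0)
'''

def terminate_gracefully():
    """Start the worker, send it SIGTERM, and return (exit status, last output line)."""
    proc = subprocess.Popen([sys.executable, "-c", WORKER],
                            stdout=subprocess.PIPE, text=True)
    if proc.stdout.readline().strip() != "ready":
        raise RuntimeError("worker did not start")
    proc.terminate()                # sends SIGTERM on POSIX
    remaining, _ = proc.communicate(timeout=10)
    return proc.returncode, remaining.strip()

exit_status, last_line = terminate_gracefully()
print(exit_status, last_line)
```

A process killed outright by the signal would instead report a negative return code here, so the status tells you whether the handler got to run.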

The master is re-exec'ing itself. So it'll be the same PID, but a different executable.
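
To see why the PID stays the same, here is a small Python sketch of what exec does on a POSIX system: the process image is replaced, but the process itself (and its PID) is kept. This is only an illustration of exec semantics, not of condor_master; the names CHILD and pids_before_and_after_exec are invented for the example.

```python
import subprocess
import sys

# Hypothetical child: print its PID, then replace its own image via exec;
# the new image prints its PID again.
CHILD = r'''
import os, sys
print(os.getpid(), flush=True)      # flush first: exec discards unflushed buffers
os.execv(sys.executable,
         [sys.executable, "-c", "import os; print(os.getpid())"])
'''

def pids_before_and_after_exec():
    """Run the child and return the PID it printed before and after exec."""
    result = subprocess.run([sys.executable, "-c", CHILD],
                            capture_output=True, text=True, check=True)
    before, after = result.stdout.split()
    return int(before), int(after)

before, after = pids_before_and_after_exec()
print(before, after)
```

On a POSIX system both numbers should match, which is consistent with the process table still showing the same condor_master entry after the "Doing exec" line in the log.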

Best,


matt