
Re: [HTCondor-users] some jobs die at daemon restart, some don't



Hi Christoph,

We avoid restarting condor on worker nodes that have running jobs. In our experience, all jobs die when you do this (although we haven't checked this carefully with the most recent versions of condor).

Regards,
Andrew.

________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Beyer, Christoph [christoph.beyer@xxxxxxx]
Sent: Thursday, February 04, 2016 1:29 PM
To: htcondor-users
Subject: [HTCondor-users] some jobs die at daemon restart, some don't

Hi,

When restarting the condor daemon on a worker node, the jobs on that node survive most of the time. Sometimes, though, a job dies; I presume that happens when the job is actually writing to the shadow (?).

Is there a timeout or something similar that I can increase to keep all jobs happy during a daemon restart?

cheers
        ~chris


--
/*   Christoph Beyer     |   Office: Building 2b / 23     *\
 *   DESY                |    Phone: 040-8998-2317        *
 *   - IT -              |      Fax: 040-8994-2317        *
\*   22603 Hamburg       |     http://www.desy.de         */