
Re: [Condor-users] JobLeaseDuration



On Jun 19, 2006, at 2:03 PM, Cliff Padgett wrote:

Does this work for the MPI universe?  I queued up some MPI jobs with
JobLeaseDuration = 3600, then on the dedicated scheduler tried
condor_off -fast -schedd followed by a Condor restart. However, all the
jobs appeared to have been killed and restarted.

When you issue condor_off -fast, the schedd still kills all of its running jobs. What makes it fast is that standard universe jobs aren't checkpointed first. If you instead killed the schedd and its shadows outright, so that they had no chance to shut their jobs down, you should see JobLeaseDuration and job reconnect do their thing.
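
To make the distinction concrete, here is a toy model of the two cases in Python. It is only a sketch of the behavior described above, not Condor code: the function name, the shutdown labels, and the outcome strings are all made up for illustration, and it assumes a job can be reconnected only if the schedd comes back before the lease runs out.

```python
def job_outcome(shutdown, job_lease_duration, outage_seconds):
    """Toy model of what happens to a running job (hypothetical names).

    shutdown: "condor_off_fast" (the schedd shuts down and kills its jobs)
              or "abrupt_kill" (the schedd and shadows die without cleanup).
    job_lease_duration: the job's JobLeaseDuration, in seconds.
    outage_seconds: how long the schedd was gone before it came back.
    """
    if shutdown == "condor_off_fast":
        # The schedd kills its running jobs itself, so the lease never matters.
        return "killed"
    if shutdown == "abrupt_kill":
        # Assumption: reconnect works only while the lease is still valid.
        if outage_seconds < job_lease_duration:
            return "reconnected"
        return "lease_expired"
    raise ValueError("unknown shutdown type: %r" % (shutdown,))


if __name__ == "__main__":
    for kind, outage in [("condor_off_fast", 60),
                         ("abrupt_kill", 60),
                         ("abrupt_kill", 7200)]:
        print(kind, outage, job_outcome(kind, 3600, outage))
```

With JobLeaseDuration = 3600 as in your test, the model gives "killed" for the condor_off -fast case no matter how short the outage is, which matches what you saw.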

It would be nice to have a condor_off option that leaves the jobs running, but we don't have it at the moment.

+--------------------------------+-----------------------------------+
|           Jaime Frey           | I used to be a heavy gambler.     |
|       jfrey@xxxxxxxxxxx        | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+