
Re: [Condor-users] quick question: is periodic vacate possible



That sounds like a reasonable feature request: allow periodic signal delivery to Vanilla Universe jobs. We recently extended kill_sig to the Vanilla Universe to support checkpointing on exit.
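
For illustration, a self-checkpointing job along these lines might look like the sketch below. It assumes the vacate arrives as SIGTERM (the usual soft-kill default unless kill_sig overrides it). The file name state.json, the Job class, and the toy running-sum workload are hypothetical placeholders, not anything provided by Condor.

```python
import json
import os
import signal
import tempfile


class Job:
    """Toy long-running job that can save and resume its own state."""

    def __init__(self, path="state.json"):  # hypothetical checkpoint file
        self.path = path
        self.stop_requested = False
        self.state = self._load()

    def _load(self):
        # Resume from an earlier checkpoint if one exists.
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return {"step": 0, "total": 0}

    def save(self):
        # Write to a temporary file, then rename, so an interrupted
        # write never leaves a truncated checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)

    def request_stop(self, signum=None, frame=None):
        # Signal handler: only set a flag; the main loop does the saving.
        self.stop_requested = True

    def run(self, n_steps):
        # Placeholder workload: a running sum of the step indices.
        while self.state["step"] < n_steps and not self.stop_requested:
            self.state["total"] += self.state["step"]
            self.state["step"] += 1
        self.save()
        return self.state["step"] >= n_steps


def main():
    job = Job()
    # Assumption: the vacate is delivered as SIGTERM.
    signal.signal(signal.SIGTERM, job.request_stop)
    finished = job.run(10_000_000)
    # A non-zero exit code marks the run as incomplete.
    return 0 if finished else 1
```

The handler only sets a flag, so the checkpoint is always written from the main loop at a step boundary. For the saved file to travel back with an evicted job, the submit file would presumably also need when_to_transfer_output = ON_EXIT_OR_EVICT, assuming Condor's file transfer is in use.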

Best,


matt

On 06/17/2010 11:42 AM, Dan Bradley wrote:
> Ian,
> 
> The machine's PREEMPT expression could be used to periodically
> checkpoint vanilla universe jobs that implement some kind of
> self-checkpointing.  You would just want to make sure that WANT_VACATE
> is true for the jobs that get preempted or they will be booted off
> without any chance to save state.
> 
> --Dan
> 
> Smith, Ian wrote:
>> Dear All,
>>
>> Just a very quick question that I can't seem to find an answer for
>> anywhere:
>>
>> Is it possible to periodically vacate jobs in the same way as
>> they can be periodically held and removed?
>>
>> The reason I ask is that I've been building checkpointing
>> into some of our vanilla universe jobs and it would
>> be useful if these could be vacated say once every
>> few hours so that the checkpoint file gets stored in
>> the $(SPOOL). Some of the jobs can run for days
>> and with few students around the campus at present
>> they are unlikely to get evicted by user logins. This
>> means that the output can get lost if the startd crashes for some
>> reason*, losing several days' work.
>>
>> regards,
>>
>> -ian.
>>
>> * I've noticed several connection failures with long-running jobs
>> and I'm still not sure of the reason, although someone turning
>> off an execute host running a job is obviously one!
>>
>> --------------------------------------------
>> Dr Ian C. Smith,
>> Advanced Research Computing (e-Science) Team,
>> The University of Liverpool
>> Computing Services Department
>>
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
>>   