Re: [Condor-users] Checkpointing Condor's vanilla universe jobs.

Thanks for your contribution.

I used it on my Condor cluster and it works very well. Just one suggestion: you
can trap Condor's signals to force your programs to take a checkpoint.
When Condor vacates a program, it sends it a signal (killsig).
By trapping this signal, a program can take a checkpoint before it stops.
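
As a rough illustration of the idea, here is a minimal Python sketch of a job
that traps the vacate signal and saves its state before stopping. It assumes
the signal is SIGTERM (the actual signal depends on how killsig is set for the
job), and the file name checkpoint.json, the "step" counter and all function
names are hypothetical, made up for this example.

```python
import json
import os
import signal
import tempfile

# Hypothetical checkpoint file name; not taken from the original post.
CHECKPOINT_PATH = "checkpoint.json"

# Set by the signal handler, polled by the main loop.
vacate_requested = False


def handle_vacate(signum, frame):
    """Only record the request; the main loop does the actual saving."""
    global vacate_requested
    vacate_requested = True


def save_checkpoint(state, path=CHECKPOINT_PATH):
    """Write to a temporary file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path=CHECKPOINT_PATH):
    """Resume from an earlier checkpoint if one exists, else start fresh."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0}


def run(total_steps, path=CHECKPOINT_PATH):
    """Return True if the work finished, False if it stopped on a vacate."""
    # Assumption: the job is vacated with SIGTERM.
    signal.signal(signal.SIGTERM, handle_vacate)
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if vacate_requested:
            save_checkpoint(state, path)
            return False
        state["step"] += 1  # stand-in for one unit of real work
    save_checkpoint(state, path)
    return True


if __name__ == "__main__":
    run(1000000)
```

The handler only sets a flag, and the checkpoint is written from the main loop,
so the state is never saved in the middle of a work step. When the job is
restarted, it picks up from the step stored in the checkpoint file.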


On Monday 04 February 2008 12:37:43 Mark Calleja wrote:
> Hi,
> In case it's of use or interest to anyone else on this mailing list,
> I've written some notes on how one can use Parrot and the BLCR kernel
> modules to transparently checkpoint Condor's vanilla universe jobs. The
> link is:
> http://www.escience.cam.ac.uk/projects/camgrid/blcr.html
> This is recent/ongoing work, so feedback and/or bug reports back to me
> please.
> Cheers,
> Mark