
Re: [Condor-users] snapshot of a job



The standard answers to this are:
1. VM Universe (yes, folks use it for pause+resume)
2. Standard Universe (several limitations)
3. DMTCP (not yet supported, but it looks like we are heading down that
road)

Each has trade-offs, so you may want to evaluate the costs and benefits
based on your problem domain and/or users.
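
For the VM Universe option, a rough sketch of what a submit description
with checkpointing turned on might look like is below, generated from a
small Python helper. The helper name render_submit, the paths, the memory
size and the log file name are all made up for illustration, and the
submit commands should be checked against the manual for your Condor
version before relying on them.

```python
def render_submit(settings):
    """Render key/value settings as a Condor submit description.

    Each entry becomes a 'key = value' line, followed by a final 'queue'.
    """
    lines = [f"{key} = {value}" for key, value in settings.items()]
    lines.append("queue")
    return "\n".join(lines) + "\n"


# Illustrative VM universe job; every value here is a placeholder.
vm_job = {
    "universe": "vm",
    "executable": "long_running_vm",       # in the VM universe this is only a label
    "vm_type": "vmware",                   # assumption: a VMware-based execute node
    "vm_memory": 1024,                     # MB, hypothetical size
    "vm_networking": "false",
    "vm_checkpoint": "true",               # ask Condor to checkpoint the VM
    "vmware_dir": "/path/to/vmware_dir",   # hypothetical directory holding the VM files
    "vmware_should_transfer_files": "true",
    "log": "vm_job.log",                   # hypothetical log file name
}

submit_text = render_submit(vm_job)

if __name__ == "__main__":
    print(submit_text)
```

The printed text could be saved to a file and passed to condor_submit.
Whether a checkpoint taken this way fits a multi-day job depends on the
VM image size and how often it is written.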

Cheers,
Tim

On Fri, 2011-07-08 at 06:30 -0400, Mag Gam wrote:
> I understand the crash and recovery feature of Condor in the vanilla
> universe is broken. However, I have heard of people using VMware or Xen
> to achieve this since it's much easier.  Has anyone done this before?
> 
> Basically, I run jobs which take several days to run and it would be
> nice to take a periodic snapshot, so that when the job fails I can just
> replay the latest VM image and I don't lose all my results.