
Re: [Condor-users] checkpointing in windows



On 2/24/06, Andrassy, Neil <AndrassyNP@xxxxxxxxxxx> wrote:
> Is there a way, if your application is able to generate its own checkpoint
> data (a restart file for example) to 'simulate' checkpointing under Windows?
> i.e. Get Condor to periodically copy selected application-generated files
> and retrieve them when execution restarts on a new machine? Perhaps a flag
> in the job ClassAd to say RESTART_FILES = restart1.ext,restart2.ext

Short answer: yes.

(slightly) longer answer:

http://www.cs.wisc.edu/condor/manual/v6.6.10/6_2Microsoft_Windows.html#SECTION00723000000000000000
This section alludes to it, but really the manual needs a section
explaining "how to get non-automatic checkpointing working on Windows".

If you search these archives for "windows checkpointing" or "vanilla
universe checkpointing" you will find relevant posts scattered throughout.
The thread "Checkpointing in vanilla" has a few contributions from me,
and an older thread titled "file transfer problems with vanilla job"
will give you some more specifics.
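
As a rough illustration of the application side only (a sketch under
assumptions, not something taken from the manual or those threads): the job
looks for its restart file at startup, resumes from it if present, and
rewrites it every so often. The file name restart1.ext comes from the
question; the JSON state, the function names and the loop body are
hypothetical stand-ins. It also assumes Condor's file transfer is set up to
bring the file back when the job is evicted (I believe the submit setting
when_to_transfer_output = ON_EXIT_OR_EVICT is the relevant one, but check
the manual for your version).

```python
import json
import os

# File name borrowed from the question; purely illustrative.
RESTART_FILE = "restart1.ext"


def load_state(path=RESTART_FILE):
    """Resume from the restart file if one exists, else start fresh."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_state(state, path=RESTART_FILE):
    """Write to a temp file and rename it into place, so an eviction
    in the middle of a write never leaves a half-written restart file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run(n_steps, checkpoint_every=10, path=RESTART_FILE):
    """Do n_steps of (stand-in) work, saving state periodically."""
    state = load_state(path)
    while state["step"] < n_steps:
        state["total"] += state["step"]  # placeholder for the real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_state(state, path)
    save_state(state, path)
    return state
```

The rename step matters because the job can be killed at any moment; with it,
the restart file on disk is always either the previous complete copy or the
new complete copy.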

Apologies for not giving you a fuller answer; I am in a bit of a
hurry at the moment.

Matt