
Re: [Condor-users] Condor on Windows - failover and checkpointing



A quick followup question to clarify a related point - does this mean
that the standard universe is limited on Windows, or that the standard
universe just doesn't exist at all on Windows (i.e. all executable
jobs in Windows must be vanilla)?

On 4/25/06, Matt Hope <matthew.hope@xxxxxxxxx> wrote:
> On 4/25/06, Shaun J. O'Callaghan <Shaun.OCallaghan@xxxxxxxxxxxx> wrote:
> > When implementing a Condor based system on a Windows network, as
> > checkpointing functionality is missing, does this simply mean when a job
> > is interrupted it is either suspended or terminated altogether?
>
> by default yes.
>
> You have the option of getting fancy and trapping the eviction signal,
> responding to it in time by exiting and doing your own checkpointing.
> see posts passim on the list about this.
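
For anyone finding this thread later, here is a rough sketch of what
"doing your own checkpointing" could look like, written in Python. It is
an illustration under assumptions, not anything Condor ships: the
function names, the JSON state layout and the toy workload (summing
integers) are all made up for the example. In particular, how the
eviction notice actually reaches a job on Windows is not covered in this
thread, so the handler registration below simply hooks whichever of
SIGTERM, SIGBREAK and SIGINT the platform's Python exposes, and that may
or may not match what Condor sends.

```python
import json
import os
import signal
import tempfile

# Set by the signal handler; polled by the work loop between items.
stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only records that a stop was requested."""
    global stop_requested
    stop_requested = True


def install_handlers():
    """Hook whichever of these signals exist on this platform.

    Assumption: the eviction notice arrives as one of these signals.
    """
    for name in ("SIGTERM", "SIGBREAK", "SIGINT"):
        sig = getattr(signal, name, None)
        if sig is not None:
            signal.signal(sig, request_stop)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file first, then rename over the old checkpoint,
    so an eviction in mid-write never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n_items, should_stop=lambda: stop_requested):
    """Process items 0..n_items-1, resuming from the checkpoint if any.

    Returns True if all work finished, False if it stopped early.
    """
    state = load_checkpoint(path)
    while state["next_item"] < n_items:
        if should_stop():
            save_checkpoint(path, state)
            return False
        state["total"] += state["next_item"]  # stand-in for real work
        state["next_item"] += 1
    save_checkpoint(path, state)
    return True
```

A job would call install_handlers() once at startup and then
run("checkpoint.json", n), exiting promptly when run() returns False.
Each unit of work has to be short enough that the loop notices the flag
before the job is hard-killed. The checkpoint file also has to travel
with the job when it restarts elsewhere; with Condor's file transfer
that presumably means when_to_transfer_output = ON_EXIT_OR_EVICT in the
submit file, but check the manual for your version.
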
>
> > Also, does this mean that there's no transaction-style failover in the
> > event of a job failure?
>
> Your job is, by default, terminated and then becomes available to run
> on another machine (or indeed the same one if it becomes free again)
>
> >
> > Any light that could be shed on these two issues would be greatly
> > appreciated.
>
> searching the list will provide more info - my answers are, I'm
> afraid, brief at the moment but the question has been answered before.
>
> Matt