
Re: [Condor-users] quick question: is periodic vacate possible



Hi Dan,

Would this not periodically vacate all jobs, though? Ideally
I'd like to vacate only those jobs that save their own checkpoints.

I've set WANT_VACATE=TRUE on all of the execute hosts. Is
it possible to set this on a per-job basis?

thanks,

-ian.
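
A minimal sketch of how the two points in this thread might be combined,
assuming the startd policy expressions are allowed to reference a custom
job attribute. The attribute name SelfCheckpoints and the three-hour
interval are hypothetical, and treating the machine attribute
TotalJobRunTime as the job's run time on the slot is an assumption that
should be checked against the manual for the Condor version in use.

```
# Execute host configuration (e.g. condor_config.local).
# Give every preempted job the chance to shut down gracefully.
WANT_VACATE = TRUE

# Periodic-vacate clause: only jobs that advertise the (hypothetical)
# SelfCheckpoints attribute and have run for more than 3 hours match.
# This clause would need to be OR'ed with any existing PREEMPT policy.
PREEMPT = (TARGET.SelfCheckpoints =?= True) && (TotalJobRunTime > 3 * 3600)

# Submit description file for a self-checkpointing vanilla job:
# the leading "+" adds a custom attribute to the job ClassAd.
+SelfCheckpoints = True
```

Jobs that do not set the attribute leave the clause undefined-safe
(=?= yields False rather than Undefined), so they would not be vacated
by it. WANT_VACATE could be made per-job in the same way by testing a
job attribute instead of being set to TRUE.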

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-
> bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
> Sent: 17 June 2010 16:43
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] quick question: is periodic vacate possible
> 
> Ian,
> 
> The machine's PREEMPT expression could be used to periodically
> checkpoint vanilla universe jobs that implement some kind of
> self-checkpointing.  You would just want to make sure that WANT_VACATE
> is true for the jobs that get preempted or they will be booted off
> without any chance to save state.
> 
> --Dan
> 
> Smith, Ian wrote:
> > Dear All,
> >
> > Just a very quick question that I can't seem to find an answer for
> > anywhere:
> >
> > Is it possible to periodically vacate jobs in the same way as
> > they can be periodically held and removed ?
> >
> > The reason I ask is that I've been building checkpointing
> > into some of our vanilla universe jobs and it would
> > be useful if these could be vacated, say, once every
> > few hours so that the checkpoint file gets stored in
> > the $(SPOOL). Some of the jobs can run for days
> > and, with few students around the campus at present,
> > they are unlikely to get evicted by user logins. This
> > means that the output can get lost if the startd
> > crashes for some reason*, losing several days'
> > work.
> >
> > regards,
> >
> > -ian.
> >
> > * I've noticed several connection failures with long running jobs
> >   and I'm still not sure of the reason although someone turning
> >   off an execute host running a job is obviously one !
> >
> > --------------------------------------------
> > Dr Ian C. Smith,
> > Advanced Research Computing (e-Science) Team,
> > The University of Liverpool
> > Computing Services Department
> >
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/
> >