
RE: [Condor-users] Job is getting rerun instead of terminated



> Jobs running for more than 12 hours are to be thrown out of the queue.
> The users will not add a job ad like that, because they just forget.
> How could I do this with config files? Is there something like
> "defaults for submitting jobs", that I could change?

Yes - it's called "SUBMIT_EXPRS".  See

http://www.cs.wisc.edu/condor/manual/v6.7.9/3_3Configuration.html#sec:Submit-Config-File-Entries
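
As a rough, untested sketch of what that could look like: the lines
below, placed in the condor_config that condor_submit reads on the
submit machines, should add a 12-hour removal policy to every newly
submitted job ad. The expression itself is my assumption (it is not
taken from your attached config), so please try it on a test job first.

```
# Sketch only; verify on a test job before rolling it out.
# Remove a job once it has been in the Running state (JobStatus == 2)
# for more than 12 hours (12 * 60 * 60 = 43200 seconds).
PeriodicRemove = (JobStatus == 2) && ((CurrentTime - EnteredCurrentStatus) > (12 * 60 * 60))

# Attributes named here are inserted into the job ads that
# condor_submit creates.
SUBMIT_EXPRS = PeriodicRemove
```

This only affects jobs submitted after the change, and a user can still
set periodic_remove in the submit file. I have not checked which value
wins when both are present, so it is worth looking at a fresh job with
"condor_q -l" to confirm that PeriodicRemove carries the expression you
expect.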

Mike Yoder
Principal Member of Technical Staff
Direct : +1.408.321.9000
Fax    : +1.408.321.9030
Mobile : +1.408.497.7597
yoderm@xxxxxxxxxx

Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
http://www.optena.com


> On Fri, 22 Jul 2005, Jaime Frey wrote:
> 
> > On Jul 22, 2005, at 5:26 AM, Andreas Vetter wrote:
> >
> > > we have a setup that is meant to terminate all jobs after 12 hours
> > > runtime. Most jobs are vanilla universe. But sometimes there are
> > > jobs that are evicted after 12 hours and then started again on
> > > other nodes. The user finally killed the job with condor_rm. Other
> > > jobs are terminated after 12 hours as expected.
> > >
> > > Attached is part 3 of our global condor config and the user's log
> > > for the restarting job.
> > >
> > > Did I miss something?
> >
> > When an execute machine kills a job for running too long, the schedd
> > doesn't consider the job complete. It thinks that the execute machine
> > wasn't willing to let the job run long enough and it now needs to
> > find another machine that will let the job run to completion. When a
> > job leaves the queue is controlled by the job ad in the schedd.
> >
> > If you want your jobs to leave the queue when they run longer than 12
> > hours, you need to set periodic_remove in the job ads. If you want
> > the jobs to stay in the queue but not get rerun, you need to modify
> > the startd's requirements to not run jobs that previously ran for
> > more than 12 hours.
> >
> >
> > +----------------------------------+---------------------------------+
> > |            Jaime Frey            |  Public Split on Whether        |
> > |        jfrey@xxxxxxxxxxx         |  Bush Is a Divider              |
> > |  http://www.cs.wisc.edu/~jfrey/  |         -- CNN Scrolling Banner |
> > +----------------------------------+---------------------------------+
> >
> 
> --
>  Andreas Vetter
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users