
Re: [Condor-users] quick questions



WOW, neat trick!

Thanks,
rob

On Feb 28, 2008, at 11:42 AM, Dan Bradley wrote:

As if by magic:

http://www.cs.wisc.edu/condor/manual/v7.0/3_3Configuration.html#14088

Apparently, those configuration settings were only ever documented in
the version history, and by now, the manual no longer contains the
version history from back when they were added, so they weren't
documented at all.

I provided some real-life examples that I hope are also helpful.

--Dan

Robert E. Parrott wrote:

Where can I find more info on "SYSTEM_PERIODIC_HOLD" ?

I'm not finding it in the Condor manual.

thanks,
rob

On Feb 27, 2008, at 1:48 PM, Dan Bradley wrote:



Robert E. Parrott wrote:



A couple of quick one-offs on configs:

1) How does a user specify a max runtime on a job from their submit
file?






What do you want to achieve: putting the job on hold if it runs for too
long?  Or simply specifying the maximum amount of time the job should be
given to finish before being preempted by higher priority jobs?




Here, I'd like to have users be able to specify the max total run
time for a parallel job before it's ended.




Is this different from putting the job on hold if it runs too long?  I'm
not aware of any other option specific to the parallel job universe.



But I would be very interested in the answers to the other cases you
pose as well.  I assume for the first you want to use a PERIODIC_HOLD
expression, but the second would be useful as well.




Yes, periodic_hold in the job submit file can be used to put a job on
hold if it runs too long.  An alternative would be to have users insert
a custom attribute that specifies maximum runtime and then you would use
SYSTEM_PERIODIC_HOLD in the config file to put jobs on hold that run
longer than expected.  Example:

in submit file:
+MaxRunTime = 3600

in config file:
SYSTEM_PERIODIC_HOLD = JobStatus == 2 && MaxRunTime =!= UNDEFINED && \
    (RemoteWallClockTime - CumulativeSuspensionTime) > MaxRunTime
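
For the per-job case mentioned first, a minimal sketch might look like
the following.  It reuses the attributes from the config expression
above; the one-hour limit is an assumed value for illustration only.

in submit file:
# hold this job once it has accumulated more than one hour of run time
periodic_hold = JobStatus == 2 && (RemoteWallClockTime - CumulativeSuspensionTime) > 3600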


The other thing I alluded to was a way to specify the amount of time a
job should be allowed to run without interruption.  This doesn't really
apply to the parallel universe, because parallel universe jobs should
always run without preemption.

in submit file:
# this should finish in less than one hour
# if it does not, it is ok for it to be preempted
MaxJobRetirementTime = 3600

in execute machine config file:
# allow up to 2 days max of uninterrupted time for jobs
MaxJobRetirementTime = 3600*24*2

I hope that helps you.

--Dan

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/



==========================
Robert E. Parrott, Ph.D. (Phys. '06)
Associate Director, Grid and
        Supercomputing Platforms
Project Manager, CrimsonGrid Initiative
Harvard University Sch. of Eng. and App. Sci.
Maxwell-Dworkin  211,
33 Oxford St.
Cambridge, MA 02138
(617)-495-5045




