
[HTCondor-users] Deadline scheduling condor_off



Random idea: "condor_off -startd timestamp" for scheduled service outages, e.g., preventive maintenance or limiting execution time on expensive resources.

The idea is to give Condor a hard scheduling constraint and let it optimize system utilization until then. Initially this might be as simple as an immediate "condor_off -startd -peaceful" followed by "condor_off -startd -force-graceful" at T = timestamp - graceful_timeout - miscellaneous. However, if there are current (or future) mechanisms for jobs to signal their ability to checkpoint (and the time that takes) or their maximum run time, Condor could make informed decisions about which jobs to start in the interim before timestamp, maximizing goodput while minimizing badput. Additionally, Condor could analyze recent run times of jobs in a cluster or set to decide which jobs to schedule before T = timestamp.
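
To make the arithmetic concrete, here is a rough Python sketch of the two calculations above: when to escalate from peaceful to graceful shutdown, and whether a job with a declared maximum run time could still finish before the deadline. Nothing here is an existing HTCondor feature; the function names, the margin parameter (standing in for "miscellaneous"), and the example values are all hypothetical.

```python
from datetime import datetime, timedelta

def escalation_time(deadline, graceful_timeout, margin):
    """Hypothetical helper: T = timestamp - graceful_timeout - miscellaneous."""
    return deadline - graceful_timeout - margin

def fits_before_deadline(now, deadline, max_runtime):
    """Hypothetical helper: True if a job started now, with the given
    declared maximum run time, would finish by the deadline."""
    return now + max_runtime <= deadline

# Example values (assumed for illustration only).
deadline = datetime(2024, 1, 10, 8, 0, 0)
graceful_timeout = timedelta(minutes=30)
margin = timedelta(minutes=5)

t_escalate = escalation_time(deadline, graceful_timeout, margin)
print(t_escalate)  # 2024-01-10 07:25:00
```

With these numbers, a job declaring a 2 hour maximum run time would still be worth starting at 05:00 but not at 06:30, which is the kind of decision the startd or negotiator could make if it knew the deadline.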

Thanks.

--
Stuart Anderson
sba@xxxxxxxxxxx