
Re: [Condor-users] limit execution time for a job



Dominik Pich wrote:
> Hi,
> I studied the manual/tutorials and found a lot to describe _where_ and when to run jobs. I know about requirements and rank and later I also found the priority option to categorize jobs based on 'importance'.
>
> Now what I want to do is:
> I want to express a maximum execution time (processor time) for a given priority
> e.g. a job with a -10 prio has only 10s processor time


I am curious to know why you would want such a policy.

Usually folks would set up Condor to preempt (interrupt) a lower priority job if there is a higher priority job waiting, and then give the machine to the higher priority job.

This sort of setup prevents the undesirable situation where you have plenty of idle/available machines but still kick off jobs because they hit some "maximum execution time".

> Is that feasible with condor?

Sure. Although I really recommend you state *what you really want* at a higher level. Condor is more flexible/powerful than many other schedulers where everything revolves around time slots and job execution limits. Are you really looking for max execution time based on job priority as you say, or are you really looking for a good way to interrupt lower priority jobs with higher priority jobs if they are available? If the latter, there are better ways to achieve it with Condor.

However, if this is indeed what you really want, there are two ways to do it: either in the job's policy or in the machine's policy. If you "control"/own the machines and want the machines to enforce this policy, put limits like this into the machine policy (aka the condor_startd policy; the condor_startd is the daemon that enforces the policy of a machine's owner). To read how to do this, see
  http://www.cs.wisc.edu/condor/manual/v6.8/3_5Startd_Policy.html
in the manual, but the short answer is you could place something like the following into the condor_config file on the machine(s) where you want such a limit enforced:

   START = True
   WANT_SUSPEND = False
   PREEMPT = IfThenElse((JobPrio == -10) && \
                        ((CurrentTime - JobStart) > 10), True, False)
   KILL            = $(ActivityTimer) > 10


The above machine policy says this machine is always available to start jobs, that it never wants to suspend jobs, that it should try to gracefully shut down (preempt) jobs after 10 seconds of execution if the job priority is -10, and that it should hard-kill the job if it doesn't go away within 10 seconds of being asked nicely.

If you want to enforce this in the job's policy, check out the PERIODIC_REMOVE expression in your condor_submit job description file.
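
As a rough sketch of the job-side approach, a submit description file might look something like the following. The executable name is a placeholder, and the expression assumes that wall-clock time since the job last started running (the JobCurrentStartDate job attribute) is an acceptable stand-in for processor time. Periodic expressions are only evaluated at intervals, so do not expect the limit to be enforced to the exact second.

   # my_job is a placeholder executable name
   universe        = vanilla
   executable      = my_job
   priority        = -10
   # JobStatus == 2 means the job is running; the limit is wall-clock time, not CPU time
   periodic_remove = (JobStatus == 2) && (JobPrio == -10) && ((CurrentTime - JobCurrentStartDate) > 10)
   queue

Unlike the machine policy above, this only affects jobs submitted with this file, so it relies on the submitter to include the expression.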

Hope this helps point you in the right direction,
regards,
Todd



> Maybe someone's already done something like that!?
>
> Regards,
> Dominik

