
Re: [Condor-users] max wall time?

On 3/2/06, Andrew Zahn <azahn@xxxxxxxxxxxxxxxx> wrote:
> Hi,
> does anyone know  a simple way to set a max wall time or upper limit on
> how long all jobs can run? once a runtime reaches a specified time it
> needs to be removed regardless of status.

Use the periodic_remove or periodic_hold expressions in the submit file.
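
As a rough sketch of the wall-time case (the two-hour limit is only an
illustrative value, not something from the original question), a line
like this in the submit file should remove a job once its current run
has lasted longer than the limit:

```
# Remove the job if it is running (JobStatus == 2) and has been in
# that state for more than 2 hours. The 2 * 3600 limit is an assumed
# example value; pick whatever suits your pool.
periodic_remove = (JobStatus == 2) && ((time() - EnteredCurrentStatus) > (2 * 3600))
```

Note that EnteredCurrentStatus is reset each time the job starts
running again, so this limits the length of a single run rather than
the total time across restarts. On older Condor versions without the
time() ClassAd function, the CurrentTime attribute was commonly used
in its place. Swapping periodic_remove for periodic_hold keeps the job
in the queue on hold instead of deleting it.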

Obviously this is a per-user setting (it lives in each submit file), so
I also run a scheduled job to spot users who aren't setting it and go
and 'encourage' them to sort it out.

I find a very good setting is
periodic_hold = JobRunCount > X
where X is something in the hundreds (though the number obviously
depends on your pool usage) for a predominantly vanilla /
non-checkpointing universe.

This stops a job that isn't getting anywhere (no condor_store_cred run
since a password change for example) from kicking too many other jobs.