
Re: [Condor-users] Interesting use of backfill





Erik Paulson wrote:

On Fri, Jun 22, 2007 at 02:39:10PM +0300, Mark Silberstein wrote:
The only problem was killing the startd when the actual program terminates.
At the moment, the program started by the startd kills the startd right
before it terminates, but that's awkward. If the startd had a parameter that
made it shut itself down when the backfill executable exits, that would be
fantastic. We have no problem fixing that ourselves, but we thought this
parameter could perhaps be added in the 6.9.x series.

Any other ideas would be appreciated!

You could try STARTD_NOCLAIM_SHUTDOWN, which is the number of seconds the
startd will stay unclaimed before shutting itself down. It may work if
"Backfill" jobs count as the "Claim", but I'm not sure if they do (or,
actually, even if they should! :)
I checked the code. Backfill does count as a claim for this purpose, so your idea should work.
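
A minimal sketch of that setting in the startd's configuration file. The 300-second timeout is an illustrative value, not one from this thread, and the sketch assumes backfill is already enabled elsewhere in the configuration:

```
# Illustrative value (assumption): shut the startd down once it has
# gone 300 seconds without a claim. Per the code check above, backfill
# counts as a claim for this purpose.
STARTD_NOCLAIM_SHUTDOWN = 300
```

With this in place, the startd should exit roughly that many seconds after the backfill program ends and the slot returns to unclaimed, provided nothing else claims it first.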

Another idea is to use the new DAEMON_SHUTDOWN expression, but you might have to wait for 6.9.4. In 6.9.4, the startd ad will advertise how much time the slot has spent in each state/activity pair, so you could configure the shutdown expression to stop the startd once some backfill time has been used but the slot is no longer in the Backfill state.
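
A rough sketch of what such an expression might look like. The attribute name TotalTimeBackfillBusy is an assumption about how the accumulated backfill time is advertised; it is not named in this thread, so check the startd ad on your version before relying on it:

```
# Assumption: TotalTimeBackfillBusy holds the seconds this slot has
# spent in the Backfill state with Busy activity. The =!= UNDEFINED
# guard keeps the expression False until the attribute is published.
STARTD.DAEMON_SHUTDOWN = (TotalTimeBackfillBusy =!= UNDEFINED) && \
                         (TotalTimeBackfillBusy > 0) && \
                         (State != "Backfill")
```

Here the shutdown would be triggered the first time the expression is evaluated after the slot has done some backfill work and then left the Backfill state.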

Another idea is to use MaxJobRetirementTime or condor_off -peaceful to let
the startd run until the job boundary. Again, I'm not sure that the startd
treats the "backfill" as a job, so it may not work either.

I checked the code for this too. MaxJobRetirementTime doesn't apply to backfill, so this one won't work.

--Dan