Re: [HTCondor-users] File last modification time or job last write() attribute?
- Date: Wed, 25 May 2016 23:05:34 -0400
- From: Michael V Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] File last modification time or job last write() attribute?
From: MIRON LIVNY <miron@xxxxxxxxxxx>
Date: 05/25/2016 02:29 PM
> Can you tell us how you plan to use this information. In other words
> do you care about when the last write took place?
Sure, professor: in some scenarios the only reasonable course of action is
to keep trying until the bitter, bitter end. And so if timing out is not an
option, then one doesn't put a timeout function into the code in the first
place.
I suppose it's in the same realm as Michelle Craft's
on slide nine, with its eight-hour deadline:
The trick is detecting the asymptote as early as possible.
And so if a log file is supposed to have data written to it for each
time slice, for example, and nothing has appeared in it for far longer than
you'd expect a single time slice ought to take, then you can conclude that
you're not going to make any further forward progress and some action should
be taken. Since the job won't terminate itself for reasons, it falls to a
periodic_hold or _remove expression which can use that last-write time number
compared to CurrentTime in order to trigger, imposing an external timeout.
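
To make the comparison concrete, here is a rough sketch of the check itself,
written in Python rather than as a ClassAd expression. It is an illustration
only: the function names and the max_idle_seconds threshold are made up for
this example, and it reads the file's modification time directly from the
filesystem instead of from any job attribute.

```python
import os
import time


def seconds_since_last_write(path, now=None):
    """Return the seconds elapsed since the file at `path` was last modified."""
    if now is None:
        now = time.time()  # plays the role of CurrentTime in the comparison
    return now - os.path.getmtime(path)


def is_stalled(path, max_idle_seconds, now=None):
    """Return True when nothing has been written for longer than allowed.

    `max_idle_seconds` is a hypothetical threshold: pick something comfortably
    larger than the time a single time slice ought to take.
    """
    return seconds_since_last_write(path, now) > max_idle_seconds
```

A periodic_hold or periodic_remove expression would make the same
subtraction and comparison, provided the last-write time is available to it
as a number.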