
Re: [Condor-users] Antwort: Re: Fault Behaviour of Condor



On 8/3/06, thomas.t.hoppe@xxxxxxxxxxxxxxxxxxx
<thomas.t.hoppe@xxxxxxxxxxxxxxxxxxx> wrote:


Hi Matt,

For 1.) and 2.) the behaviour is just fine! -- I've also followed the discussion
regarding disk failure.
Maybe the documentation should state more clearly that
Condor's default behaviour is to restart a job in case of a fault
(I might have overlooked that).

I guess that is kind of perceived as the 'proper' default behaviour
for a job queue system.
Note that by using the periodic_* and on_exit_* expressions on
submission you can change this.
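
As a rough illustration of those expressions, a submit description file
might look like the sketch below. The executable name and the one-hour
threshold are made up for the example, and the exact policy is only an
assumption about what one might want; check the condor_submit manual page
for your version before relying on any of it.

```
# Hypothetical submit file; "my_job" is a placeholder executable.
universe   = vanilla
executable = my_job

# Put the job on hold if it was killed by a signal.
on_exit_hold = (ExitBySignal == True)

# Leave the queue only on a normal exit with code 0; any other exit
# keeps the job in the queue so it is run again.
on_exit_remove = (ExitBySignal == False) && (ExitCode == 0)

# Remove a job that has been running (JobStatus == 2) for more than
# an hour (illustrative threshold).
periodic_remove = (JobStatus == 2) && ((CurrentTime - EnteredCurrentStatus) > 3600)

queue
```

With a policy like this, a job that exits with a non-zero code is neither
held nor removed, so it stays queued and is restarted.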

Regarding 3.)
I gave it over an hour I think.

What is your job lease duration (if you are using it)?
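
For reference, the lease is set per job with the job_lease_duration
submit command, given in seconds. A minimal sketch, where the value is
purely illustrative and not a recommendation:

```
# Hypothetical fragment of a submit file.
# Lease length in seconds; 1200 is an arbitrary example value.
job_lease_duration = 1200
```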

I've updated my Executors to 6.8 but the behaviour persists.
Do you think moving the central manager to 6.8 can resolve this?

Shadows failing to die when their starter is not talking to them
anymore is not something an upgrade to the collector/negotiator can
solve.

If your executors are on 6.8 you probably want your submitters to be
6.8 as well...

Matt