
Re: [Condor-users] eviction problems with 7.4.2



Is there anything of interest in the Start[er]Log that was not in the logs of 7.0.2? Any Startd or Starter crashes?

Rob

Smith, Ian wrote:
Dear All,

I've recently been taking a look at checkpointing under the vanilla
universe*. I had everything working fine using Condor 7.0.2 on
the execute hosts (running Win XP SP 3) but when I moved
to 7.4.2 there are problems when jobs get evicted.
When this happens because of mouse/keyboard activity I see
the machine go through the usual Claimed/Busy -> Preempting/Vacating -> Preempting/Killing -> Owner
states but the job carries on running according to condor_q
(and the log file).

If I look on the execute host, then the execute directory has been wiped but condor_q insists that
the job is still running. Eventually when the job starts again
I see a "job disconnected" error in the job's log file. As
well as this, none of the output files get returned to the $(SPOOL)
area.
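
One way to confirm what the job's log file actually recorded is to scan it for the relevant events. The following is only a rough sketch, not a Condor tool: it assumes each event header line has the form `NNN (cluster.proc.subproc) MM/DD HH:MM:SS message`, and that codes 001, 004 and 022 mean "executing", "evicted" and "disconnected" respectively. The function names are made up for illustration.

```python
import re

# Assumed header layout of a user-log event line:
#   NNN (cluster.proc.subproc) MM/DD HH:MM:SS message
EVENT_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\S+) (\S+) (.*)$")

# Event codes of interest (assumed numbering, see the note above).
WATCHED = {"001": "executing", "004": "evicted", "022": "disconnected"}


def scan_events(lines):
    """Return (code, job_id, timestamp, message) for each watched event header."""
    found = []
    for line in lines:
        m = EVENT_RE.match(line.rstrip("\n"))
        if m and m.group(1) in WATCHED:
            job_id = f"{int(m.group(2))}.{int(m.group(3))}"
            timestamp = f"{m.group(5)} {m.group(6)}"
            found.append((m.group(1), job_id, timestamp, m.group(7)))
    return found


def disconnect_without_eviction(events):
    """True if a disconnect event appears with no eviction event before it."""
    seen_evict = False
    for code, *_rest in events:
        if code == "004":
            seen_evict = True
        elif code == "022" and not seen_evict:
            return True
    return False
```

Feeding it the lines of the job's log (for example `scan_events(open(path))`, with `path` being wherever the submit file's log entry points) and then calling `disconnect_without_eviction` would show whether the submit side ever logged the eviction that the execute host went through.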

The execute hosts have this config:

WANT_SUSPEND = FALSE
WANT_VACATE  = TRUE
START        = ( $(UWCS_START) && $(OfficeHours) \
               || ( $(OfficeHours) == FALSE ) && ( $(ShutdownHours) == FALSE ) )
SUSPEND      = FALSE
CONTINUE     = $(UWCS_CONTINUE)
PREEMPT      = $(UWCS_SUSPEND) && $(OfficeHours)
KILL         = TRUE

which worked fine with 7.0.2.
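
For anyone reading the START expression above, its grouping can be sketched in Python. This assumes that `&&` binds more tightly than `||` (as in C), that all three macros evaluate to plain TRUE or FALSE, and it ignores UNDEFINED values entirely. The function and parameter names are illustrative only.

```python
def start(uwcs_start: bool, office_hours: bool, shutdown_hours: bool) -> bool:
    """Model of the START policy, assuming && takes precedence over ||."""
    # ( UWCS_START && OfficeHours )
    #   || ( OfficeHours == FALSE && ShutdownHours == FALSE )
    return (uwcs_start and office_hours) or (
        (not office_hours) and (not shutdown_hours)
    )


def truth_table():
    """Enumerate every input combination together with the START result."""
    rows = []
    for uwcs in (False, True):
        for office in (False, True):
            for shutdown in (False, True):
                rows.append((uwcs, office, shutdown, start(uwcs, office, shutdown)))
    return rows
```

Under these assumptions, jobs may start during office hours only when the usual UWCS_START test passes, and outside office hours whenever it is not a shutdown period, regardless of keyboard or mouse state.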

Any ideas what may be wrong? Could it be something to do with one
of the daemons not receiving a signal from condor_kbdd?

regards,

-ian.

* I've written up some detailed instructions on this for the benefit of
our users. If anyone is interested I'll post the link here.

--------------------------------------------
Dr Ian C. Smith,
e-Science Team,
The University of Liverpool,
Computing Services Department
