Re: [HTCondor-users] job evictions



I spoke with Jaime, and it looks like the preemption was happening in the negotiator: it was running HTCondor 7.8, which has preemption enabled by default.

Suchandra

Suchandra Thapa
sthapa@xxxxxxxxxxxxxxx
Computation Institute
Searle Chemistry Laboratory #201A
5735 South Ellis Avenue
Chicago, IL 60637


On Fri, Nov 21, 2014 at 8:42 AM, Ben Cotton <ben.cotton@xxxxxxxxxxxxxxxxxx> wrote:
On Wed, Nov 19, 2014 at 12:28 PM, Suchandra Thapa <ssthapa@xxxxxxxxxxxx> wrote:
> How do I get detailed information about why a job was evicted from a job
> slot? We have a user whose jobs keep getting evicted even though the
> configuration doesn't have any preemption enabled.

Are you sure you don't have preemption enabled? There are three places
preemption might occur: in the negotiator, in the startd, and in the
schedd (only if using a dedicated scheduler). See section 3.5.9.5 of
the manual[1] (for versions 8.0 and prior) for an explanation of
disabling negotiator- and startd-based preemption.
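
As a rough way to double-check that, here is a minimal Python sketch that reads "NAME = value" settings and lists the preemption-related knobs that are not explicitly switched off. The knob names (PREEMPT, PREEMPTION_REQUIREMENTS, RANK, NEGOTIATOR_CONSIDER_PREEMPTION) and their "off" values are my recollection of what the manual section recommends, so treat them as assumptions and verify against the manual for your version. The helper names are made up for illustration. An unset knob is reported too, since it falls back to a version-dependent default.

```python
# Assumed "off" values for each preemption-related knob (check the manual
# for your HTCondor version; this table is an assumption, not authoritative).
DISABLED_VALUES = {
    "PREEMPT": "false",
    "PREEMPTION_REQUIREMENTS": "false",
    "RANK": "0",
    "NEGOTIATOR_CONSIDER_PREEMPTION": "false",
}


def parse_config(text):
    """Parse simple 'NAME = value' lines into a dict, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip().upper()] = value.strip()
    return settings


def knobs_allowing_preemption(settings):
    """Return knobs that are unset or not set to their assumed 'off' value.

    This is a plain string comparison, so an equivalent expression such as
    'RANK = 0.0' would still be reported and needs a manual look.
    """
    suspects = []
    for knob, off_value in DISABLED_VALUES.items():
        value = settings.get(knob)
        if value is None or value.lower() != off_value:
            suspects.append(knob)
    return suspects


if __name__ == "__main__":
    sample = "PREEMPT = False\nRANK = 0\nPREEMPTION_REQUIREMENTS = True\n"
    print(knobs_allowing_preemption(parse_config(sample)))
```

Run it against the settings from both the negotiator host and the execute nodes, since the two kinds of preemption are configured in different places.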

Depending on your START configuration, the job may also be evicted due
to keyboard activity, CPU load, etc. I'd suggest looking in
StarterLog.slotX for the slot your job last ran on (check the
LastRemoteHost job attribute) to see why it got kicked off.
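
To make that lookup concrete, here is a small Python sketch that turns a LastRemoteHost value into the execute host and the starter log file name to open there. It assumes the usual "slotN@hostname" form for the attribute and a bare hostname on single-slot machines (where I am assuming the log is plain StarterLog). The function name and the example host are hypothetical.

```python
def starter_log_for(last_remote_host):
    """Split a LastRemoteHost value into (hostname, starter log file name).

    Assumes values look like 'slot3@node12.example.edu'. A value without
    '@' is treated as a single-slot machine whose log is 'StarterLog'
    (an assumption; check the log directory on the execute node).
    """
    slot, sep, host = last_remote_host.strip().partition("@")
    if not sep:
        return slot, "StarterLog"
    return host, "StarterLog." + slot


if __name__ == "__main__":
    host, log_name = starter_log_for("slot3@node12.example.edu")
    print("On", host, "look at", log_name)
```

The attribute itself shows up in the long job listing (condor_q -l for a queued job, condor_history -l for one that has left the queue). The log lives in the execute node's log directory.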

[1] http://research.cs.wisc.edu/htcondor/manual/v8.0/3_5Policy_Configuration.html#SECTION00459500000000000000


Thanks,
BC

--
Ben Cotton
main: 888.292.5320

Cycle Computing
Leader in Utility HPC Software

http://www.cyclecomputing.com
twitter: @cyclecomputing