
Re: [Condor-users] Jobs being shutdown immediately.




The PREEMPT expression has nothing to do with preemption of one job by another. It is for kicking a job off of a machine because of the machine policy (e.g. because the machine is needed for some other purpose).

Run the following command to see your PREEMPT expression on the execute machine where you are having the problem:

condor_config_val -v PREEMPT

--Dan

Mark Tigges wrote:
That was the first thing I tried ...  we've been using it like that
forever on our current farm at our central location. The reason is
that we have a tonne of short jobs and only a very few large jobs.
So, if there are competing jobs, with PREEMPT on, short jobs take
precedence.  Right?

Regardless ... these tests, including the log I previously sent, are
with only one job being submitted to a farm of three machines.  It's
getting preempted when nothing else is reported by condor_q -global.
The farm hasn't been deployed to artists yet.  condor_q -analyze says
the job was removed for an unknown reason.

Mark.

On Thu, Sep 17, 2009 at 6:14 AM, David Watrous
<dwatrous@xxxxxxxxxxxxxxxxxx> wrote:
Mark,
Check your PREEMPT expression on the workstation.  It is evaluating to True
and causing the job to terminate.
Hope this helps,
Dave
--
===================================
David Watrous
main: 888.292.5320
Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools
http://www.cyclecomputing.com
http://www.cyclecloud.com
On Sep 17, 2009, at 12:24 AM, Mark Tigges wrote:

We have condor (7.0.5) running just fine at our own studio.  I'm trying
to set it up remotely in Shanghai, and everything is running alright.
If I try simple hello world batch files, all works great.

As soon as I try a bigger job, rendering an image for a few minutes,
jobs get scheduled, start, then go down right away into idle.  Wait 4
minutes and the cycle repeats itself.  I've been reading manuals for
hours, googling, and tearing my hair out.  Here's the starter log from
the machine running the job.

9/17 12:06:09 match_info called
9/17 12:06:09 Received match <10.88.70.102:64805>#1253158085#15#...
9/17 12:06:09 State change: match notification protocol successful
9/17 12:06:09 Changing state: Unclaimed -> Matched
9/17 12:06:10 Request accepted.
9/17 12:06:10 Remote owner is yhong@***********
9/17 12:06:10 State change: claiming protocol successful
9/17 12:06:10 Changing state: Matched -> Claimed
9/17 12:06:14 Got activate_claim request from shadow (<10.88.70.26:4063>)
9/17 12:06:14 Remote job ID is 75.0
9/17 12:06:14 Got universe "VANILLA" (5) from request classad
9/17 12:06:14 State change: claim-activation protocol successful
9/17 12:06:14 Changing activity: Idle -> Busy
9/17 12:06:19 State change: PREEMPT is TRUE
9/17 12:06:19 Changing activity: Busy -> Retiring
9/17 12:06:19 State change: claim retirement ended/expired
9/17 12:06:19 State change: WANT_VACATE is FALSE
9/17 12:06:19 Changing state and activity: Claimed/Retiring -> Preempting/Killing
9/17 12:06:20 Got KILL_FRGN_JOB while in Preempting state, ignoring.
9/17 12:06:20 Got RELEASE_CLAIM while in Preempting state, ignoring.
9/17 12:06:20 Starter pid 3524 exited with status 0
9/17 12:06:20 State change: starter exited
9/17 12:06:20 State change: No preempting claim, returning to owner
9/17 12:06:20 Changing state and activity: Preempting/Killing -> Owner/Idle
9/17 12:06:20 State change: IS_OWNER is false
9/17 12:06:20 Changing state: Owner -> Unclaimed
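
For anyone chasing the same symptom across several execute machines, here is a minimal sketch (not part of Condor) that scans log lines shaped like the excerpt above and reports how many seconds each job was Busy before PREEMPT evaluated to TRUE. The function name find_preemptions and the default year are hypothetical choices made for illustration; the message strings it matches are copied from the log above and may differ in other Condor versions.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the excerpt, e.g. "9/17 12:06:14 <message>".
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}) (.*)$")


def find_preemptions(lines, year=2009):
    """Return [(job_id, seconds_busy), ...], one entry per 'PREEMPT is TRUE' event.

    The log has no year in its timestamps, so one is assumed (default 2009,
    the year of this thread) purely to build datetime objects.
    """
    results = []
    job_id = None
    busy_since = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip wrapped continuation lines and anything untimestamped
        stamp = datetime.strptime(f"{year}/{match.group(1)}", "%Y/%m/%d %H:%M:%S")
        message = match.group(2)
        if message.startswith("Remote job ID is "):
            job_id = message[len("Remote job ID is "):]
        elif message == "Changing activity: Idle -> Busy":
            busy_since = stamp
        elif message == "State change: PREEMPT is TRUE" and busy_since is not None:
            results.append((job_id, int((stamp - busy_since).total_seconds())))
            busy_since = None
    return results


sample = """\
9/17 12:06:14 Remote job ID is 75.0
9/17 12:06:14 Changing activity: Idle -> Busy
9/17 12:06:19 State change: PREEMPT is TRUE
""".splitlines()

print(find_preemptions(sample))  # [('75.0', 5)]
```

A job that is preempted a few seconds after going Busy, as job 75.0 is here, points at the machine's own PREEMPT policy rather than at competition from other jobs.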
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/


