
Re: [Condor-users] Jobs being shutdown immediately.



Mark, 

Check your PREEMPT expression on the workstation. It is evaluating to True and causing the job to terminate.
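
Your log shows "PREEMPT is TRUE" five seconds after the activity changes to Busy, so the job never gets a chance to run. To see what the workstation is actually using, run condor_config_val PREEMPT on it. As a quick test only, and assuming this machine is allowed to run jobs regardless of what else is happening on it (I don't know your policy, so treat this as a sketch rather than a recommended setting), you could override the expression in the local config file:

```
# Local config file on the execute machine (hypothetical test override).
# Assumption: it is acceptable for jobs to keep running on this machine
# no matter what the owner or other processes are doing.
# Use this only to confirm that PREEMPT is what is killing the job.
PREEMPT = False
```

Run condor_reconfig on the machine afterwards so the startd picks up the change. If the render then runs to completion, go back to the original PREEMPT expression and work out which part of it is true on that workstation.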

Hope this helps,
Dave

-- 
===================================
David Watrous
main: 888.292.5320

Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools 


On Sep 17, 2009, at 12:24 AM, Mark Tigges wrote:

We have Condor (7.0.5) running just fine at our own studio.  I'm
trying to set it up remotely in Shanghai, and everything is running
all right.  If I try simple hello-world batch files, all works great.

As soon as I try a bigger job, such as rendering an image for a few
minutes, jobs get scheduled, start, then drop right back to idle.
About 4 minutes later the cycle repeats itself.  I've been reading
manuals for hours, googling, and tearing my hair out.  Here's the
starter log from the machine running the job.

9/17 12:06:09 match_info called
9/17 12:06:09 Received match <10.88.70.102:64805>#1253158085#15#...
9/17 12:06:09 State change: match notification protocol successful
9/17 12:06:09 Changing state: Unclaimed -> Matched
9/17 12:06:10 Request accepted.
9/17 12:06:10 Remote owner is yhong@***********
9/17 12:06:10 State change: claiming protocol successful
9/17 12:06:10 Changing state: Matched -> Claimed
9/17 12:06:14 Got activate_claim request from shadow (<10.88.70.26:4063>)
9/17 12:06:14 Remote job ID is 75.0
9/17 12:06:14 Got universe "VANILLA" (5) from request classad
9/17 12:06:14 State change: claim-activation protocol successful
9/17 12:06:14 Changing activity: Idle -> Busy
9/17 12:06:19 State change: PREEMPT is TRUE
9/17 12:06:19 Changing activity: Busy -> Retiring
9/17 12:06:19 State change: claim retirement ended/expired
9/17 12:06:19 State change: WANT_VACATE is FALSE
9/17 12:06:19 Changing state and activity: Claimed/Retiring -> Preempting/Killing
9/17 12:06:20 Got KILL_FRGN_JOB while in Preempting state, ignoring.
9/17 12:06:20 Got RELEASE_CLAIM while in Preempting state, ignoring.
9/17 12:06:20 Starter pid 3524 exited with status 0
9/17 12:06:20 State change: starter exited
9/17 12:06:20 State change: No preempting claim, returning to owner
9/17 12:06:20 Changing state and activity: Preempting/Killing -> Owner/Idle
9/17 12:06:20 State change: IS_OWNER is false
9/17 12:06:20 Changing state: Owner -> Unclaimed
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/