
Re: [Condor-users] windows xp log off kills jobs



We are not using RunAsOwner, though it looks useful and we will probably
start using it for another type of job we run under Condor.

We just tested your 2nd question.  Yes, my Condor jobs are killed when
someone else logs on, then logs off.

Below is the portion of the StartLog on the machine, with my comments.

12/31 08:23:09 vm1: Got activate_claim request from shadow
(<136.200.32.179:4851>)
12/31 08:23:09 vm1: Remote job ID is 413.7
12/31 08:23:10 vm1: Got universe "VANILLA" (5) from request classad
12/31 08:23:10 vm1: State change: claim-activation protocol successful
12/31 08:23:10 vm1: Changing activity: Idle -> Busy
## the other person touches the keyboard to login; job on vm1 suspended
12/31 08:54:01 vm1: State change: SUSPEND is TRUE
12/31 08:54:01 vm1: Changing activity: Busy -> Suspended
## other person logs out; apparently jobs on vm2, vm3, and vm4 are
## forced off.  Why?
12/31 08:55:23 DaemonCore: Command received via TCP from host
<136.200.32.179:2291>
12/31 08:55:23 DaemonCore: received command 404
(DEACTIVATE_CLAIM_FORCIBLY), calling handler (command_handler)
12/31 08:55:23 vm2: Called deactivate_claim_forcibly()
12/31 08:55:23 DaemonCore: Command received via TCP from host
<136.200.32.179:2293>
12/31 08:55:23 DaemonCore: received command 404
(DEACTIVATE_CLAIM_FORCIBLY), calling handler (command_handler)
12/31 08:55:23 vm3: Called deactivate_claim_forcibly()
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.102:2307>
12/31 08:55:23 DaemonCore: received command 60011 (DC_NOP), calling
handler (handle_nop())
12/31 08:55:23 Starter pid 2404 exited with status 0
12/31 08:55:23 vm2: State change: starter exited
12/31 08:55:23 vm2: Changing activity: Busy -> Idle
12/31 08:55:23 vm2: State change: idle claim shutting down due to
CLAIM_WORKLIFE
12/31 08:55:23 vm2: Changing state and activity: Claimed/Idle ->
Preempting/Vacating
12/31 08:55:23 vm2: State change: No preempting claim, returning to
owner
12/31 08:55:23 vm2: Changing state and activity: Preempting/Vacating ->
Owner/Idle
12/31 08:55:23 vm2: State change: IS_OWNER is false
12/31 08:55:23 vm2: Changing state: Owner -> Unclaimed
12/31 08:55:23 DaemonCore: Command received via TCP from host
<136.200.32.179:2295>
12/31 08:55:23 DaemonCore: received command 404
(DEACTIVATE_CLAIM_FORCIBLY), calling handler (command_handler)
12/31 08:55:23 vm4: Called deactivate_claim_forcibly()
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.179:2297>
12/31 08:55:23 DaemonCore: received command 443 (RELEASE_CLAIM), calling
handler (command_release_claim)
12/31 08:55:23 Warning: can't find resource with ClaimId
(<136.200.32.102:1037>#1198783236#65#...)
12/31 08:55:23 Starter pid 4084 exited with status 0
12/31 08:55:23 vm3: State change: starter exited
12/31 08:55:23 vm3: Changing activity: Busy -> Idle
12/31 08:55:23 vm3: State change: idle claim shutting down due to
CLAIM_WORKLIFE
12/31 08:55:23 vm3: Changing state and activity: Claimed/Idle ->
Preempting/Vacating
12/31 08:55:23 vm3: State change: No preempting claim, returning to
owner
12/31 08:55:23 vm3: Changing state and activity: Preempting/Vacating ->
Owner/Idle
12/31 08:55:23 vm3: State change: IS_OWNER is false
12/31 08:55:23 vm3: Changing state: Owner -> Unclaimed
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.102:2309>
12/31 08:55:23 DaemonCore: received command 60011 (DC_NOP), calling
handler (handle_nop())
12/31 08:55:23 Starter pid 3284 exited with status 0
12/31 08:55:23 vm4: State change: starter exited
12/31 08:55:23 vm4: Changing activity: Busy -> Idle
12/31 08:55:23 vm4: State change: idle claim shutting down due to
CLAIM_WORKLIFE
12/31 08:55:23 vm4: Changing state and activity: Claimed/Idle ->
Preempting/Vacating
12/31 08:55:23 vm4: State change: No preempting claim, returning to
owner
12/31 08:55:23 vm4: Changing state and activity: Preempting/Vacating ->
Owner/Idle
12/31 08:55:23 vm4: State change: IS_OWNER is false
12/31 08:55:23 vm4: Changing state: Owner -> Unclaimed
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.102:2313>
12/31 08:55:23 DaemonCore: received command 60011 (DC_NOP), calling
handler (handle_nop())
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.179:2299>
12/31 08:55:23 DaemonCore: received command 443 (RELEASE_CLAIM), calling
handler (command_release_claim)
12/31 08:55:23 Warning: can't find resource with ClaimId
(<136.200.32.102:1037>#1198783236#67#...)
12/31 08:55:23 DaemonCore: Command received via UDP from host
<136.200.32.179:2301>
12/31 08:55:23 DaemonCore: received command 443 (RELEASE_CLAIM), calling
handler (command_release_claim)
12/31 08:55:23 Warning: can't find resource with ClaimId
(<136.200.32.102:1037>#1198783236#68#...)
## job on vm1 continues from suspension, then is forced off too!
12/31 08:56:05 vm1: State change: CONTINUE is TRUE
12/31 08:56:05 vm1: Changing activity: Suspended -> Busy
12/31 08:56:05 vm2: State change: IS_OWNER is TRUE
12/31 08:56:05 vm2: Changing state: Unclaimed -> Owner
12/31 08:56:05 DaemonCore: Command received via TCP from host
<136.200.32.179:2335>
12/31 08:56:05 DaemonCore: received command 404
(DEACTIVATE_CLAIM_FORCIBLY), calling handler (command_handler)
12/31 08:56:05 vm1: Called deactivate_claim_forcibly()
12/31 08:56:05 DaemonCore: Command received via UDP from host
<136.200.32.102:2374>
12/31 08:56:05 DaemonCore: received command 60011 (DC_NOP), calling
handler (handle_nop())
12/31 08:56:05 Starter pid 1664 exited with status 0
12/31 08:56:05 vm1: State change: starter exited
12/31 08:56:05 vm1: Changing activity: Busy -> Idle
12/31 08:56:06 vm1: State change: START is false
12/31 08:56:06 vm1: Changing state and activity: Claimed/Idle ->
Preempting/Vacating
12/31 08:56:06 vm1: State change: No preempting claim, returning to
owner
12/31 08:56:06 vm1: Changing state and activity: Preempting/Vacating ->
Owner/Idle
12/31 08:56:06 DaemonCore: Command received via UDP from host
<136.200.32.179:2337>
12/31 08:56:06 DaemonCore: received command 443 (RELEASE_CLAIM), calling
handler (command_release_claim)
12/31 08:56:06 Warning: can't find resource with ClaimId
(<136.200.32.102:1037>#1198783236#69#...)
12/31 08:56:08 vm2: State change: IS_OWNER is false
12/31 08:56:08 vm2: Changing state: Owner -> Unclaimed
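
To compare what happened on each slot, the state and activity changes in
a log like the one above can be pulled out with a short script. This is
only a minimal Python sketch, not part of Condor: the names join_wrapped
and slot_transitions are hypothetical, and it assumes the format shown
above (entries start with an MM/DD HH:MM:SS timestamp, long entries are
wrapped onto a second line by the mail client, and my comments start
with ##).

```python
import re
from collections import defaultdict

# An entry starts with a timestamp such as "12/31 08:55:23 ".
ENTRY_START = re.compile(r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")

# Matches "vmN: Changing state|activity|state and activity: OLD -> NEW".
TRANSITION = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<slot>vm\d+): "
    r"Changing (?P<kind>state and activity|state|activity): "
    r"(?P<old>\S+) -> (?P<new>\S+)$"
)


def join_wrapped(lines):
    """Rejoin wrapped entries; drop blank lines and ## comments."""
    entries = []
    in_entry = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("##"):
            # A comment or blank line ends the current entry, so a
            # wrapped comment tail is not glued onto a log entry.
            in_entry = False
        elif ENTRY_START.match(line):
            entries.append(line)
            in_entry = True
        elif in_entry:
            entries[-1] += " " + line
    return entries


def slot_transitions(entries):
    """Return {slot: [(timestamp, kind, old, new), ...]} in log order."""
    result = defaultdict(list)
    for entry in entries:
        m = TRANSITION.match(entry)
        if m:
            result[m.group("slot")].append(
                (m.group("stamp"), m.group("kind"),
                 m.group("old"), m.group("new"))
            )
    return dict(result)


# A few lines copied from the log above, including one wrapped entry.
SAMPLE = """\
12/31 08:54:01 vm1: Changing activity: Busy -> Suspended
## other person logs out
12/31 08:55:23 vm2: Changing state and activity: Claimed/Idle ->
Preempting/Vacating
12/31 08:56:05 vm1: Changing activity: Suspended -> Busy
"""

if __name__ == "__main__":
    for slot, changes in sorted(slot_transitions(
            join_wrapped(SAMPLE.splitlines())).items()):
        for stamp, kind, old, new in changes:
            print(f"{slot} {stamp} {kind}: {old} -> {new}")
```

Run against the full StartLog, this would list the transitions for each
slot side by side, which makes it easier to see that vm1 was the only
slot suspended before its claim was deactivated.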

Ralph Finch
916-653-7552


-----Original Message-----
Hmm... Are you using RunAsOwner? If so, does it happen if you run a job
and then someone else logs on and then off?

Clutching at straws here...

Matt