Re: [Condor-users] partitionable slots are not returned?
- Date: Mon, 28 Mar 2011 14:51:03 +0200
- From: Carsten Aulbert <carsten.aulbert@xxxxxxxxxx>
- Subject: Re: [Condor-users] partitionable slots are not returned?
Hi Matt
On Monday 28 March 2011 14:10:14 Matthew Farrellee wrote:
> I would guess that your START is preventing the slot from handling any
> jobs or going back to the Unclaimed state. When a dynamic slot hits
> Unclaimed it gets folded back into the partitionable slot.
>
Let's see, right now, after a lot of testing it looks like this:
condor_status -direct gpu016
Name               OpSys Arch   State   Activity LoadAv Mem  ActvtyTime
slot1@xxxxxxxxxxxx LINUX X86_64 Owner   Idle     0.000  7141 0+01:06:49
slot1_1@xxxxxxxxxx LINUX X86_64 Owner   Idle     0.000   256 0+01:09:11
slot1_2@xxxxxxxxxx LINUX X86_64 Claimed Idle     0.000   256 0+01:08:53
slot1_3@xxxxxxxxxx LINUX X86_64 Owner   Idle     0.000   256 0+00:06:46

             Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX     4     3       1         0       0          0        0
StarterLogs look like this:
[...]
03/28 13:36:37 Create_Process succeeded, pid=6240
03/28 13:36:56 Got SIGQUIT. Performing fast shutdown.
03/28 13:36:56 ShutdownFast all jobs.
03/28 13:36:56 Process exited, pid=6240, signal=9
03/28 13:36:56 Last process exited, now Starter is exiting
03/28 13:36:56 **** condor_starter.orig (condor_STARTER) pid 6185 EXITING WITH STATUS 0
(which was about 70 minutes ago).
StartLog:
[...]
03/28 13:36:56 slot1_1: Called deactivate_claim_forcibly()
03/28 13:36:56 Starter pid 6185 exited with status 0
03/28 13:36:56 slot1_1: State change: starter exited
03/28 13:36:56 slot1_1: Changing activity: Busy -> Idle
03/28 13:36:56 slot1_1: State change: received RELEASE_CLAIM command
03/28 13:36:56 slot1_1: Changing state and activity: Claimed/Idle -> Preempting/Vacating
03/28 13:36:56 slot1_1: State change: No preempting claim, returning to owner
03/28 13:36:56 slot1_1: Changing state and activity: Preempting/Vacating -> Owner/Idle
condor_config_val -dump gpu016.atlas.local | grep '^START ='
START = ( Target.Owner =?= "testuser" )
Does this prevent the slot from going back to Unclaimed?
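One way to reason about this is a toy model in Python (not the real ClassAd evaluator). It rests on two assumptions that are not confirmed anywhere in this thread: that IS_OWNER is left at its documented default of (START =?= False), and that START is evaluated without a job ad when no job is being considered, so Target.Owner is UNDEFINED. The names UNDEFINED, meta_equals, start and is_owner are made up for the illustration.

```python
# Toy model of the ClassAd "meta-equals" operator (=?=).
# This is an illustration only, not HTCondor's evaluator.

UNDEFINED = object()  # stands in for the ClassAd UNDEFINED value


def meta_equals(left, right):
    """=?= never yields UNDEFINED: True only for same type and same value."""
    if left is UNDEFINED or right is UNDEFINED:
        return left is right
    return type(left) is type(right) and left == right


def start(target_ad):
    """START = ( Target.Owner =?= "testuser" ).

    target_ad is None when no job ad is available (assumption).
    """
    if target_ad is None:
        owner = UNDEFINED
    else:
        owner = target_ad.get("Owner", UNDEFINED)
    return meta_equals(owner, "testuser")


def is_owner(target_ad):
    """Assumed default policy: IS_OWNER = (START =?= False)."""
    return meta_equals(start(target_ad), False)


print(is_owner(None))                   # no candidate job
print(is_owner({"Owner": "testuser"}))  # job submitted by testuser
```

Under those assumptions START is a definite False whenever there is no job from testuser, so IS_OWNER would hold and the slot would sit in Owner instead of Unclaimed. That would match the Owner/Idle slots above, but it is a guess until checked against the actual IS_OWNER setting on gpu016.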
> You can identify the slot types with PartitionableSlot, DynamicSlot
> attributes.
Looks fine to me:
condor_status -direct gpu016 -l|grep -E '(PartitionableSlot|DynamicSlot|Name)'
Name = "slot1@xxxxxxxxxxxxxxxxxx"
PartitionableSlot = TRUE
Name = "slot1_1@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
Name = "slot1_2@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
Name = "slot1_3@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
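The same check can be scripted. The following is a minimal sketch in Python that turns the grep output above into a name-to-type mapping. It assumes that each Name line comes before the slot-type attribute of the same ad, as it does in the output shown here; the function name slot_types and the "static" fallback label are made up for the illustration.

```python
# Sample input: the grep output from above, copied verbatim.
SAMPLE = """\
Name = "slot1@xxxxxxxxxxxxxxxxxx"
PartitionableSlot = TRUE
Name = "slot1_1@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
Name = "slot1_2@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
Name = "slot1_3@xxxxxxxxxxxxxxxxxx"
DynamicSlot = TRUE
"""


def slot_types(text):
    """Map each slot name to 'partitionable', 'dynamic' or 'static'.

    Assumes the Name attribute precedes the type attribute of the same ad.
    """
    types = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if not sep:
            continue  # not an "Attr = value" line
        key, value = key.strip(), value.strip()
        if key == "Name":
            current = value.strip('"')
            types[current] = "static"  # until a type attribute says otherwise
        elif current is not None and value.upper() == "TRUE":
            if key == "PartitionableSlot":
                types[current] = "partitionable"
            elif key == "DynamicSlot":
                types[current] = "dynamic"
    return types


for name, kind in slot_types(SAMPLE).items():
    print(name, kind)
```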
Still puzzled
Cheers
Carsten