
Re: [HTCondor-users] condor_userprio's WeightedAccumulatedUsage



Hi Jon,

Looking at the source, WeightedAccumulatedUsage should indeed take the
slot weight into account, as Todd said. If you run `condor_userprio
-l`, does AccumulatedUsageN match WeightedAccumulatedUsageN for the
user in question?

Have you verified that `condor_userprio -resetusage` is actually
resetting the usage? Are there any other jobs running for that user?

Have you checked that the slots are correctly reporting their slot
weight? (e.g. with `condor_status -af SlotWeight Name`)
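As a quick sanity check on Todd's arithmetic, you can back out the implied average slot weight from the two totals you already have. This is just a rough sketch using the figures from your earlier message (30,219 s of wall clock and ~600,000 s of weighted usage); it assumes WeightedAccumulatedUsage is simply wall clock scaled by SlotWeight:

```python
# Rough sanity check: if WeightedAccumulatedUsage is wallclock time
# scaled by SlotWeight, then the ratio of the two totals approximates
# the average slot weight the jobs actually ran with.
# Figures below are taken from the test run described in this thread.
total_wallclock = 30219   # sum of RemoteWallClockTime over 1000 jobs (s)
weighted_usage = 600000   # WeightedAccumulatedUsage reported (s, approx.)

implied_slot_weight = weighted_usage / total_wallclock
print(f"implied average slot weight: {implied_slot_weight:.1f}")
# -> implied average slot weight: 19.9
```

A ratio near 20 would line up with Todd's guess that the jobs landed on slots where Cpus=20, whereas with SLOT_WEIGHT = Cpus and genuinely 1-core slots you'd expect the ratio to be close to 1.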


Thanks,
BC

On Thu, Apr 6, 2017 at 5:02 PM, Jon Bernard <jonbernard@xxxxxxxxx> wrote:
> Hi Todd,
>
> SLOT_WEIGHT = Cpus, but the number of cores per slot is only 1.
>
> Jon
>
>
>
> On Thu, Apr 6, 2017 at 3:05 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx>
> wrote:
>>
>> On 4/6/2017 1:45 PM, Jon Bernard wrote:
>>>
>>> Hi all,
>>>
>>> I'm seeing some strange numbers for WeightedAccumulatedUsage from one of
>>> our pools.
>>>
>>> Our test case is to submit 1000 jobs which sleep 30 seconds. The total
>>> remotewallclocktime for all the jobs is 30,219 seconds. However, the
>>> usage for the user reported by condor_userprio for these jobs is on the
>>> order of 600,000 seconds.
>>>
>>> For jobs which sleep 0 seconds, condor_userprio reports usage of 300,000
>>> to 600,000 seconds, as compared to about 200 seconds of walltime.
>>>
>>> The test script is essentially
>>>
>>> condor_userprio -resetusage <user>
>>> condor_submit sleep30
>>> clusterid=$(condor_q -af clusterid | head -n1)
>>> condor_wait -num 1000 /tmp/$clusterid.log
>>> condor_history -af remotewallclocktime -limit 1000 | awksum
>>> condor_userprio -allusers -const 'name == <user>' -af
>>> WeightedAccumulatedUsage
>>>
>>> Is there a configuration macro which might be affecting this?
>>>
>>> Thanks,
>>> Jon
>>>
>>
>> Hi Jon,
>>
>> What is the value of the config knob SLOT_WEIGHT?
>>
>> By default, SLOT_WEIGHT = Cpus
>>
>> IIRC, the "Weighted" prefix in WeightedAccumulatedUsage means it takes the
>> SLOT_WEIGHT into account. So if you are using the default SLOT_WEIGHT =
>> Cpus, then I would expect to see the results you got above if your sleep
>> jobs ran on a lot of 20-core slots, i.e. slots where Cpus=20 (since 600k
>> seconds / 20 = 30k).
>>
>> Hope the above helps
>> Todd
>>
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with
>> a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/
>



-- 
Ben Cotton
Technical Marketing Manager

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing