
Re: [HTCondor-users] Wasted resources handling



I think you'll want to take a look at system_periodic_hold. This allows you to set policy on when jobs will be held. 
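
The general shape is an expression that evaluates to True when a job should be held, plus a matching reason string. A minimal standalone example (the 48-hour limit is just a placeholder) would be:

System_periodic_hold = ( JobStatus == 2 && (time() - EnteredCurrentStatus) > (48 * 3600) )
System_periodic_hold_reason = "Job exceeded this pool's 48-hour runtime limit"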

So for your situations:

- How do you determine whether they're actually using a GPU or not? If only certain applications use the GPU, you can match the job's Cmd against a list of known GPU executables. That won't see through a user's wrapper script, but you can tinker with the pattern.

# Start from an empty policy; each case below appends to these macros.
System_periodic_hold_reason = undefined
System_periodic_hold = False


# ClassAd regexp() patterns are double-quoted strings; gpuapp1-3 are the known GPU executables
IsNonGpuApp = ( ! regexp("/(gpuapp1|gpuapp2|gpuapp3)", Cmd) )
HoldNonGpuJobExpr = ( RequestGpus >= 1 && IsNonGpuApp )
HoldNonGpuJobReason = "Job requested GPU, but is not a known GPU application"
submit_attrs = $(submit_attrs) IsNonGpuApp
Startd_attrs = $(startd_attrs) HoldNonGpuJobExpr HoldNonGpuJobReason

# $() expands the macros inline, so the schedd can evaluate the policy against the job ad alone
System_periodic_hold = $(system_periodic_hold) || $(HoldNonGpuJobExpr)
System_periodic_hold_reason = ifThenElse($(HoldNonGpuJobExpr), $(HoldNonGpuJobReason), \
    $(system_periodic_hold_reason) )
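
Before enforcing it, you can dry-run the same test from the command line to see which jobs would get caught (using the same placeholder application names as above):

condor_q -constraint 'RequestGpus >= 1 && !regexp("/(gpuapp1|gpuapp2|gpuapp3)", Cmd)' \
    -af Owner ClusterId Cmd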

- The newer HTCondor releases have a RequestMemoryUtilizationPercent attribute which you can test against. However, you don't want to look at it for a non-running job, or too early, before the job has had a chance to grow to its normal size, and you probably want to skip small jobs below a certain request - say eight gigabytes. A job using 32 GB while requesting 512 GB would have a 6.25% utilization:

IsWastingMemory = ( JobStatus == 2 && (time() - EnteredCurrentStatus > 1800) && \
	RequestMemory > 8192 && RequestMemoryUtilizationPercent < 10 )
HoldWastingMemoryExpr = ( IsWastingMemory )
HoldWastingMemoryReason = "Job is using less than 10% of requested memory after 30 minutes."
Submit_attrs = $(submit_attrs) IsWastingMemory
Startd_attrs = $(startd_attrs) HoldWastingMemoryExpr HoldWastingMemoryReason

System_periodic_hold = $(system_periodic_hold) || $(HoldWastingMemoryExpr)
System_periodic_hold_reason = ifThenElse($(HoldWastingMemoryExpr), $(HoldWastingMemoryReason), \
    $(system_periodic_hold_reason) )
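
You can preview this one from the command line as well; a rough stand-in for the utilization test, using the standard MemoryUsage attribute (both values in megabytes), would be:

condor_q -constraint 'JobStatus == 2 && RequestMemory > 8192 && MemoryUsage < RequestMemory / 10' \
    -af Owner ClusterId RequestMemory MemoryUsage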

- For CPU utilization, you can assess the overall utilization for the run from the run time and the RemoteSysCpu and RemoteUserCpu values. Those are totals across all of a job's runs, not just the latest one, so there are limits to how finely you can dial this in, but it should be good enough to catch a core-hoarding job during its first run. We'll give the job 15 minutes to spin up before assessing its utilization. Something like this requires that a job reach at least 20% of its CPU allocation over the first 15 minutes of its runtime:

RemoteTotalCpuAllocation = \
    ( RequestCpus * ifThenElse(isUndefined(RemoteWallClockTime), 0, RemoteWallClockTime) ) + \
    ( ifThenElse(JobStatus == 2, RequestCpus * (time() - EnteredCurrentStatus), 0) )
RemoteTotalCpu = ( RemoteSysCpu + RemoteUserCpu )
IsWastingCpu = ( JobStatus == 2 && (time() - EnteredCurrentStatus > 900) && \
    ( RemoteTotalCpu / RemoteTotalCpuAllocation < 0.20 ) )
HoldWastingCpuExpr = ( IsWastingCpu )
HoldWastingCpuReason = "Job did not use more than 20% of requested CPU cores in first 15 minutes"
Submit_attrs = $(submit_attrs) RemoteTotalCpuAllocation RemoteTotalCpu IsWastingCpu
Startd_attrs = $(startd_attrs) HoldWastingCpuExpr HoldWastingCpuReason

System_periodic_hold = $(system_periodic_hold) || $(HoldWastingCpuExpr)
System_periodic_hold_reason = ifThenElse($(HoldWastingCpuExpr), $(HoldWastingCpuReason), \
    $(system_periodic_hold_reason) )

To clarify - a job that requests 20 CPUs is being allocated 20 CPU-seconds per second of wall clock, and if its accumulated system and user CPU time doesn't add up to something close to that rate, the allocation is being underutilized.
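
To put numbers on it, suppose a 20-core job has been running for 15 minutes and has accumulated only 2,700 CPU-seconds of user plus system time:

    allocation  = 20 cores * 900 seconds = 18,000 CPU-seconds
    usage       = RemoteUserCpu + RemoteSysCpu = 2,700 CPU-seconds (about 3 busy cores)
    utilization = 2,700 / 18,000 = 0.15, below the 0.20 threshold, so the job goes on hold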


As you can see, the hold-reason string is built up with nested ifThenElse expressions, so that you can give proper feedback to the users. I like to use the same approach for my START expression too, so that if a fan fails or some such, I can see at a glance why a machine has gone into the Owner state.
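
For the START case, a stripped-down sketch might look like the following - FanOK is a hypothetical health attribute here, populated by whatever hardware monitoring you have (a startd cron job, for instance):

# Hypothetical health flag; a monitoring hook would overwrite this
FanOK = True
HealthCheckExpr = ( FanOK =?= True )
STARTD_ATTRS = $(STARTD_ATTRS) FanOK HealthCheckExpr
START = $(START) && $(HealthCheckExpr)

With FanOK published in the machine ad, condor_status -af Name FanOK State shows at a glance which machines have dropped out and why.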

I just banged out the above off the top of my head, so you'll want to test it out rather than cutting and pasting - there might be typos.

You'll also want to inform your user community beforehand, so they know how to run condor_q -hold in order to find out why they've been busted.
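
For example, a user can list their own held jobs and the reasons with:

condor_q -hold <username>

or pull the reason for one particular job (the job ID here is made up):

condor_q -af HoldReason 1234.0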


	-Michael Pelletier.


> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf
> Of Bert DeKnuydt
> Sent: Thursday, April 20, 2017 10:17 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Wasted resources handling
> 
> 
> Hi,
> 
> I see lots of wasted resources in our pool, e.g.:
> 
>    * people requesting a GPU, but not using it
>    * requesting 512GB RAM, and using 32GB
>    * requesting 24 CPUs, and run single threaded stuff
> 
> What do you do against this? General mails, wiki's and info
> sessions don't seem to cut it here.   Are there any analysis
> scripts/tools circulating to bother the perpetrators by mail e.g.
> Or burn their karma... Something in USER_JOB_WRAPPER maybe?
> 
> Billing for resources is, sadly enough, no option here.
> 
> Greetings, Bert.
> 
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with
> a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/