Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] DeviceGpusAverageUsage and GpusAverageUsage

Date: Wed, 18 May 2022 13:27:51 -0500 (CDT)
From: Todd L Miller <tlmiller@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] DeviceGpusAverageUsage and GpusAverageUsage

So, at least, I understand that we can play with DeviceGpusAverageUsage to
check if the utilization is 0, but I do not understand the connection
between DeviceGpusAverageUsage and GpusAverageUsage or why the
GpusAverageUsage is undefined while the DeviceGpusAverageUsage is not.


If I recall correctly --

The GPU monitor can only monitor the utilization of a given GPU; it knows nothing about which jobs are using which device. It reports the "Device*" values for each GPU to the specific slot assigned that GPU. "GPUsAverageUsage" is a per-_job_ attribute, derived from the "Device*" values, and is set in the _job_ by the startd. Those job-ad attributes are mirrored into the slot ad by STARTD_JOB_ATTRS.
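
To make the distinction concrete, here is a minimal sketch in Python that classifies a slot ad by which of the two attributes it carries. It assumes the ad has already been fetched and is represented as a plain dict; the helper names `lookup` and `describe_gpu_usage` and the returned labels are made up for illustration and are not part of HTCondor. The lookup ignores case because ClassAd attribute names are case-insensitive.

```python
def lookup(ad, name):
    """Case-insensitive attribute lookup; returns None when the attribute is undefined."""
    wanted = name.lower()
    for key, value in ad.items():
        if key.lower() == wanted:
            return value
    return None


def describe_gpu_usage(slot_ad):
    """Classify a slot ad (a plain dict here, an assumption) by the GPU usage data it holds."""
    device = lookup(slot_ad, "DeviceGPUsAverageUsage")  # reported per device by the GPU monitor
    job = lookup(slot_ad, "GPUsAverageUsage")           # per-job value, mirrored from the job ad
    if device is None:
        return "none"
    if job is None:
        return "device-only"
    return "device-and-job"
```

A slot where only the device-level value is defined would come back as "device-only", which matches the situation described in the question.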

Additionally, none of this works for sufficiently-short jobs, although since you're talking about checking four hours in, that shouldn't be a problem.
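
A check along those lines might look like the following sketch. It is only an illustration: the function name `gpu_looks_idle` is hypothetical, the four-hour threshold is the figure from this thread rather than any HTCondor default, and the caller is assumed to supply the device-level usage value and the elapsed run time in seconds from wherever it obtains them.

```python
FOUR_HOURS = 4 * 60 * 60  # threshold from this thread, not an HTCondor default


def gpu_looks_idle(device_usage, elapsed_seconds, min_elapsed=FOUR_HOURS):
    """Return True or False once enough time has passed, else None (no verdict yet)."""
    # Undefined usage, or a job that has not run long enough, gives no verdict.
    if device_usage is None or elapsed_seconds < min_elapsed:
        return None
    return device_usage == 0
```

Returning None for short-running jobs keeps "no data yet" separate from "GPU is idle", so a policy built on this does not act on a job that has only just started.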

I haven't tested this recently, but last time I did, average GPU utilization and peak GPU memory usage were certainly being recorded in the job log (where the other usage is reported), and I believe in the job ad as well. AFAIK, there's no reason why the whole job ad wouldn't be written to the history file.


- ToddM
