
[HTCondor-users] matching gpu devicename

How does one match a slot on a machine whose GPUs have mixed device names?

For example, suppose I have a machine with 6 slots where

slot1-4 are
CUDA0DeviceName = Tesla K10.G2.8GB
CUDA1DeviceName = Tesla K10.G2.8GB
CUDA2DeviceName = Tesla K10.G2.8GB
CUDA3DeviceName = Tesla K10.G2.8GB
slot5-6 are
CUDA4DeviceName = Tesla K10.G1.8GB
CUDA5DeviceName = Tesla K10.G1.8GB

Because I have multiple GPU device names in the machines, there is
no global CUDADeviceName attribute in the ClassAd.

If I run

condor_status -constraint 'regexp("Tesla K10.G2.8GB", CUDA0DeviceName)'

everything works fine.

But is there a way to find all the cards in the pool regardless of
which CUDA<num> attribute they are advertised under?

This matters to us because we have users who write CUDA code
optimized for a specific GPU, and they will want to put
regexp("Tesla K10", CUDADeviceName) in their requirements expression,
which currently doesn't match anything.
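
One possible workaround, sketched here as an assumption rather than a confirmed HTCondor feature, is to generate a constraint that tests every CUDA<num>DeviceName attribute in turn and ORs the results together. The helper name cuda_name_constraint and the max_devices upper bound are hypothetical; each clause is guarded with the ClassAd `=!= undefined` test so that slots lacking a given attribute simply evaluate to false. This has not been checked against a live pool.

```python
def cuda_name_constraint(pattern: str, max_devices: int = 8) -> str:
    """Build a ClassAd expression that is true when any CUDA<n>DeviceName
    attribute (n = 0 .. max_devices - 1) matches the given regex pattern.

    max_devices is an assumed upper bound on GPUs per machine.
    """
    if '"' in pattern:
        # Keep the sketch simple: no escaping of embedded double quotes.
        raise ValueError("pattern must not contain double quotes")

    clauses = []
    for n in range(max_devices):
        attr = f"CUDA{n}DeviceName"
        # Guard against slots that do not advertise this attribute.
        clauses.append(f'({attr} =!= undefined && regexp("{pattern}", {attr}))')
    return " || ".join(clauses)


if __name__ == "__main__":
    # Six devices, as in the example machine above (CUDA0 .. CUDA5).
    print(cuda_name_constraint("Tesla K10", max_devices=6))
```

The printed string could then be passed to condor_status -constraint, or pasted into a job's requirements expression in place of the single regexp(...) call. It is long but mechanical, which is why generating it may be easier than writing it by hand.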

The only difference between the two cards above is airflow direction,
nothing technical, so they are really the same card.