Re: [HTCondor-users] GPUs not detected in 9.0.6 version



The condor_gpu_discovery binary is completely portable, so could you try copying it from a machine that has 8.8.15 installed to one of the machines that is not detecting GPUs, and then running it there interactively?

This will help us determine whether this is really a problem with the condor_gpu_discovery binary or something else.
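
If it is easier than comparing the two runs by eye, the test could be scripted roughly as follows. This is only a sketch, assuming Python 3 is available on the affected machine. The function names `run_discovery` and `compare` are made up for illustration, and the paths in the usage comment are hypothetical placeholders for wherever the copied 8.8.15 binary and the installed 9.0.6 binary actually live.

```python
import subprocess


def run_discovery(binary, args=()):
    """Run a discovery binary and return (exit_code, stdout, stderr).

    If the binary cannot be started at all, exit_code is None and the
    error message is returned in the stderr slot.
    """
    try:
        proc = subprocess.run(
            [binary, *args], capture_output=True, text=True, check=False
        )
    except OSError as exc:
        return None, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr


def compare(old_binary, new_binary, args=()):
    """Run both binaries with the same arguments and report the results."""
    old = run_discovery(old_binary, args)
    new = run_discovery(new_binary, args)
    return {"old": old, "new": new, "same_output": old[1] == new[1]}


# Usage (hypothetical paths, adjust to your machines):
#   result = compare("/tmp/condor_gpu_discovery-8.8.15",
#                    "/path/to/9.0.6/condor_gpu_discovery")
#   print(result["same_output"], result["old"][1], result["new"][1])
```

If the two binaries print different results on the same machine, that points at the binary itself. If they print the same thing, the cause is more likely elsewhere.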

thanks
-tj


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Carles Acosta <cacosta@xxxxxx>
Sent: Tuesday, September 28, 2021 3:20 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] GPUs not detected in 9.0.6 version
 
Dear all,

We have recently migrated our entire pool from HTCondor 8.8.15 to 9.0.6 (keeping, for now, our old PASSWORD security configuration).

Everything is working fine except for two machines that have GeForce GTX 1050 Ti GPUs. We have found that the GPU is not detected with HTCondor 9.0.6, but it is detected again with version 9.0.5.

# condor_status slot2@xxxxxxxxxxxx -af Gpus DetectedGpus CondorVersion
1 GPU-c659279d $CondorVersion: 9.0.5 Aug 18 2021 BuildID: 554415 PackageID: 9.0.5-1 $
# condor_status slot2@xxxxxxxxxxxx -af Gpus DetectedGpus CondorVersion
0 0 $CondorVersion: 9.0.6 Sep 23 2021 BuildID: 557184 PackageID: 9.0.6-1 $
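
For checking many slots at once, the same three-column output can be parsed with a short script. This is a minimal sketch that assumes the line layout shown above (Gpus, DetectedGpus, then the CondorVersion string). The names `parse_gpu_lines` and `versions_without_gpus` are illustrative only, and lines where Gpus is not a number are skipped.

```python
import re

# Sample lines copied from the condor_status output above.
SAMPLE = """\
1 GPU-c659279d $CondorVersion: 9.0.5 Aug 18 2021 BuildID: 554415 PackageID: 9.0.5-1 $
0 0 $CondorVersion: 9.0.6 Sep 23 2021 BuildID: 557184 PackageID: 9.0.6-1 $
"""

# Gpus count, DetectedGpus value, then the version number that follows
# the "$CondorVersion:" marker.
LINE_RE = re.compile(r"^(\d+)\s+(.*?)\s+\$CondorVersion:\s+(\S+)")


def parse_gpu_lines(text):
    """Return a list of (gpu_count, detected_gpus, version) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((int(match.group(1)), match.group(2), match.group(3)))
    return rows


def versions_without_gpus(rows):
    """Return the sorted set of versions that reported zero GPUs."""
    return sorted({version for count, _, version in rows if count == 0})


if __name__ == "__main__":
    parsed = parse_gpu_lines(SAMPLE)
    for count, detected, version in parsed:
        print(f"{version}: Gpus={count} DetectedGpus={detected}")
    print("No GPUs detected with:", versions_without_gpus(parsed))
```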

We have other GPU machines (GeForce RTX 2080 Ti or Tesla V100) that are correctly detected with version 9.0.6, so it seems that the problem affects only these older GPUs.

Do you know what is happening? Please let me know if you need further information.

Cheers,

Carles



--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
Avís - Aviso - Legal Notice:  http://legal.ifae.es