
[HTCondor-users] configuring a GPU machine



Hi, 

I am new to condor and have problems configuring my machine. 

I'm using HTCondor V8.0.0 on an Ubuntu 12.04 machine with 16 CPUs (8 cores with Hyper-Threading) and 4 NVIDIA Tesla C2070 GPUs. I would like to configure condor to 
1. use each GPU combined with 1 CPU as a slot and 
2. group the remaining 12 CPUs into slots of 4 CPUs each. 

I managed to provide the slots for GPUs using the following configuration: 

MACHINE_RESOURCE_gpu = 4
MACHINE_RESOURCE_actuator = 20

SLOT_TYPE_1 = gpu=1, cpu=1, actuator=1
NUM_SLOTS_TYPE_1 = 4

condor_status shows these slots correctly. 

Unfortunately, I cannot get the remaining CPUs configured as slots. Neither of the following variants makes them show up: 

SLOT_TYPE_2 = cpu=1, actuator=1
NUM_SLOTS_TYPE_2 = 12

or 

SLOT_TYPE_2 = cpu=4, actuator=1
NUM_SLOTS_TYPE_2 = 3

I tried several other configurations based on examples I found, but at best I managed to get one slot type to show up. 
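
As a sanity check that the slot types above do not ask for more than the machine offers, the totals can be tallied with a few lines of Python. This is only a sketch of the arithmetic: the dictionary and function names are my own and are not HTCondor settings, and it says nothing about how condor itself parses the configuration.

```python
# Machine totals as described above: 16 logical CPUs, 4 GPUs, 20 "actuator" units.
machine = {"cpu": 16, "gpu": 4, "actuator": 20}

# Each entry is (number of slots, resources per slot).
# These names are illustrative only, not HTCondor configuration knobs.
gpu_slots = (4, {"gpu": 1, "cpu": 1, "actuator": 1})
cpu_slots_a = (12, {"cpu": 1, "actuator": 1})   # first SLOT_TYPE_2 variant
cpu_slots_b = (3, {"cpu": 4, "actuator": 1})    # second SLOT_TYPE_2 variant


def total_demand(*slot_types):
    """Sum the resources requested by all slots of the given slot types."""
    totals = {}
    for count, resources in slot_types:
        for name, amount in resources.items():
            totals[name] = totals.get(name, 0) + count * amount
    return totals


def fits(demand, available):
    """Return True if no resource is requested beyond what is available."""
    return all(amount <= available.get(name, 0) for name, amount in demand.items())


for label, cpu_slots in (("12 x 1 CPU", cpu_slots_a), ("3 x 4 CPUs", cpu_slots_b)):
    demand = total_demand(gpu_slots, cpu_slots)
    print(label, demand, "fits" if fits(demand, machine) else "oversubscribed")
```

Both variants add up to exactly 16 CPUs and 4 GPUs, with 16 and 7 actuator units respectively, so the requested slots stay within the machine totals and the cause presumably lies elsewhere.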

What would I need to change to make it work?


Assuming the above can be made to work, I have two more questions about writing job submission files: 

1. As configured, the GPU slots above show Arch 'X86_64', and so would the CPU slots. How can I then choose a different executable for each slot type, as proposed in chapter 2.5.6 (heterogeneous submit) with the $$(Arch) macro?
2. Is it also possible to pass different arguments to the executable based on the reported 'Arch'? That would let a single application binary select which code to run, i.e., figuratively a 'fat' binary. 


Thank you for your help, 
Tobias