
[HTCondor-users] GPU Management



Dear Condor users and developers,

We are a computational physics research group in Stuttgart, Germany,
and we use Condor to manage many of our computing resources.
Recently we added GPUs to most of our nodes and would like to
include them in Condor as separate resources. We have tried the
recipe found on the internet, namely putting

    SLOT1_HAS_GPU = TRUE
    SLOT1_GPU_DEV=0
    STARTD_ATTRS=HAS_GPU,GPU_DEV,GPU

    RANK = (target.wantGPU =?= true)*10000000

into the individual hosts' configuration files. This does allow us to
ask for machines that have a GPU in the submit script. The problem is
that Condor launches as many jobs as there are CPU slots, so the jobs
all share the GPU and run extremely slowly. What we would like is for
Condor to launch at most two GPU jobs per node. We would also like
users to be able to request that theirs be the only GPU job on the
node.
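
For reference, a submit description that asks for a GPU machine with
the settings above would look roughly like the following. This is only
a sketch assuming the attribute names from the configuration shown
(HAS_GPU on the machine side, wantGPU on the job side); gpu_job.sh is
a placeholder name, not a real file of ours.

    # Sketch only; gpu_job.sh is a hypothetical executable name
    universe     = vanilla
    executable   = gpu_job.sh

    # Match only slots that advertise HAS_GPU via STARTD_ATTRS
    requirements = (HAS_GPU =?= True)

    # Custom job attribute referenced by the machine RANK expression
    +wantGPU     = True

    queue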

Any help would be very much appreciated.

Owen Hickey