
Re: [HTCondor-users] assigned gpu oddity



Yeah. $$ expansion happens against the partitionable slot (the bag of resources), not the dynamic slot that gets created to match the job request. This is why $$(AssignedGPUs) expands to a list: it's the list of GPUs that the partitionable slot has to hand out.

I think the environment variables that we set up in the slot are the only thing the job can use to know which GPU it has been assigned.
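
A minimal sketch of what that could look like inside the job's shell script, assuming the variables reported in the original message (_CONDOR_AssignedGPUs and CUDA_VISIBLE_DEVICES) are present in the job environment. The helper name pick_gpu is hypothetical and not part of HTCondor:

```shell
#!/bin/sh
# Sketch only: read the assigned GPU from the job environment instead of
# passing $$(AssignedGPUs) on the arguments line.

# pick_gpu is an illustrative helper name (an assumption, not an HTCondor tool).
pick_gpu() {
    if [ -n "${_CONDOR_AssignedGPUs:-}" ]; then
        # Slot-level name as seen in the job environment, e.g. CUDA0
        printf '%s\n' "$_CONDOR_AssignedGPUs"
    elif [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        # Fall back to the device index, e.g. 0
        printf '%s\n' "$CUDA_VISIBLE_DEVICES"
    else
        # Neither variable is set: no GPU information available
        return 1
    fi
}

gpu=$(pick_gpu) || gpu="none"
echo "assigned GPU: $gpu"
```

With this approach the arguments line in the submit file would no longer need $$(AssignedGPUs) at all; the script finds out for itself at run time.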

-tj


-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Michael Di Domenico
Sent: Monday, December 21, 2015 3:30 PM
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] assigned gpu oddity

we've switched over from some custom gpu syntax to the new condor style syntax for gpus.  i'm having an issue with the AssignedGPUs classad attribute, however.

we're running a mix of 8.2 and 8.4 on linux x86_64 on top of rhel 6

i followed the wiki page below

https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToManageGpus

I can see the GPUs in the machine classads; my issue comes when i actually run a job

using this sample submit

executable = blah
arguments = $$(AssignedGPUs)
request_gpus=1
request_cpus=1
queue 1

i can see in my environment

CUDA_VISIBLE_DEVICES=0
GPU_DEVICE_ORDINAL=0
_CONDOR_AssignedGPUs=CUDA0

but the argument to my program, $$(AssignedGPUs), or $1 in my case (it's a shell script), is

CUDA0,CUDA1,CUDA2,CUDA3

Shouldn't the AssignedGPUs argument only be the one i was assigned?

we're set up with fully dynamic slots, meaning a node has a single partitionable slot, where 100% of the cpu/memory/gpu are set

did i set something up wrong?