
Re: [HTCondor-users] Translating GPU device assignments?

Hi Michael,

My solution has been to make the job executable a simple shell script wrapper that builds the Caffe command line and passes the value of CUDA_VISIBLE_DEVICES to the -gpu option. The same approach
works for other packages, e.g. those that have to be started through Python.
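
A minimal sketch of such a wrapper is below. It is an illustration under stated assumptions, not the exact script in use: the file name caffe_wrapper.sh and the helper gpu_flag_value are hypothetical, the caffe binary is assumed to be on the PATH, and the startd is assumed to set CUDA_VISIBLE_DEVICES (or GPU_DEVICE_ORDINAL) to a value such as "0" or "0,1". One caveat: CUDA renumbers the devices listed in CUDA_VISIBLE_DEVICES starting from 0 inside the process, so whether the raw value is the right thing to hand to -gpu depends on the local setup.

```shell
#!/bin/sh
# Hypothetical wrapper (e.g. caffe_wrapper.sh) used as the HTCondor job executable.
# Assumption: the startd has set CUDA_VISIBLE_DEVICES and/or GPU_DEVICE_ORDINAL.

# Print the device list to pass to Caffe's -gpu option.
# Prefers CUDA_VISIBLE_DEVICES and falls back to GPU_DEVICE_ORDINAL.
gpu_flag_value() {
    devices="${CUDA_VISIBLE_DEVICES:-${GPU_DEVICE_ORDINAL:-}}"
    if [ -z "$devices" ]; then
        echo "caffe_wrapper: no GPU assignment found in the environment" >&2
        return 1
    fi
    printf '%s\n' "$devices"
}

# When called with Caffe arguments (e.g. "train -solver solver.prototxt"),
# append -gpu with the assigned device(s) and replace this shell with caffe.
if [ "$#" -gt 0 ]; then
    gpu="$(gpu_flag_value)" || exit 1
    exec caffe "$@" -gpu "$gpu"
fi
```

In the submit description this would be used with something like "executable = caffe_wrapper.sh" and "arguments = train -solver solver.prototxt" (solver.prototxt being a placeholder), together with "request_gpus = 1".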


> On Jul 2, 2017, at 20:27, Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx> wrote:
> Hi folks,
> I'm working on a GPU problem, and I'm trying to pin down an elegant way to inform the Caffe application which GPU it should use.
> The HTCondor startd provides GPU_DEVICE_ORDINAL and CUDA_VISIBLE_DEVICES, and their value is the argument that Caffe's "-gpu" command line option needs. With another argument, "-fromenv", you can specify which command line arguments should be taken from environment variables; for example, an environment variable setting "FLAGS_gpu=1" is the equivalent of the command line argument "-gpu=1".
> However, I'm not quite clear how to translate the HTCondor environment variable into the FLAGS_gpu, or to the command line arguments, in a graceful way.
> Since there's nothing in the job ClassAd which provides this information, I can't use a $$() notation in the arguments line in the submit description.
> I can't specify a post-match environment variable on the arguments line, directly.
> I could submit a "/bin/sh -c" command line which could expand the environment variable as an argument via $(DOLLAR)CUDA_VISIBLE_DEVICES, but that gets rather messy, syntax-wise, in order to group the arguments correctly.
> I could do a wrapper to assign the environment variable, but that's just a variant of the sh -c approach.
> Any clever suggestions? Is there a submit option I'm overlooking? Thanks!
> Michael V. Pelletier
> Principal Engineer
> Information Technology
> Future Technologies & Cloud
> Integrated Defense Systems
> Raytheon Company