Re: [HTCondor-users] matching gpu devicename

On Mon, Jun 1, 2015 at 3:17 PM, John (TJ) Knoeller <johnkn@xxxxxxxxxxx> wrote:
> These are effectively the same device?  It's annoying that they have
> separate names then. You should complain to NVidia... ;)

effectively they are the same card, but there are some design changes
concerning thermals and airflow between the two.  it's annoying, but
unfortunately there's a valid reason.

> But seriously, your best course for now would be to wrap the
> condor_gpu_discovery program in a script that rewrites its output
> to change the various CUDA[n]DeviceName attributes to a single
> CUDADeviceName = "Tesla K10" attribute.

ugh, i was hoping to avoid this, but i guess it's the only option
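
for the archives, here is a rough sketch of what such a wrapper could look like. it is untested against a real pool and rests on a few assumptions: that condor_gpu_discovery is on the PATH, that it prints one "Attribute = value" pair per line, and that the per-device names appear as CUDA0DeviceName, CUDA1DeviceName, and so on. the function names (unify_device_names, main) and the sample strings are made up for illustration.

```python
import re
import subprocess
import sys

# Matches per-device name attributes such as: CUDA0DeviceName = "..."
# (assumed output format; adjust if your discovery output differs)
PER_DEVICE_NAME = re.compile(r'^\s*CUDA\d+DeviceName\s*=')


def unify_device_names(discovery_output, unified_name="Tesla K10"):
    """Drop every CUDA<n>DeviceName line and append one CUDADeviceName line.

    All other lines are passed through unchanged. If no per-device name
    is present, nothing is added.
    """
    kept = []
    replaced = False
    for line in discovery_output.splitlines():
        if PER_DEVICE_NAME.match(line):
            replaced = True
            continue
        kept.append(line)
    if replaced:
        kept.append('CUDADeviceName = "%s"' % unified_name)
    return "\n".join(kept) + "\n"


def main(argv):
    """Run the real discovery tool, forwarding any arguments, and rewrite its output."""
    result = subprocess.run(
        ["condor_gpu_discovery"] + list(argv[1:]),
        check=True,
        capture_output=True,
        text=True,
    )
    sys.stdout.write(unify_device_names(result.stdout))
```

the wrapper script would end with a call to main(sys.argv), and the startd would be pointed at the wrapper instead of at condor_gpu_discovery itself. keeping the rewrite in its own function makes it easy to check against captured output before touching the pool config.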

> Or tell your users to target the GPUs based on capability rather than by
> name.

i'm unclear how this matching takes place.  i don't see the
CUDA_CAPABILITY(?) classads bound to my slots when i run condor_status
-l <slot>.  are they hidden somewhere, or is condor_gpu_discovery not
spitting out everything it should?