Re: [Condor-users] Condor with CUDA on windows



This is not likely to be worked on in the short term, but I've put
together a ticket that documents some of the info I've dug up:
https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2771

That said, I'd like to get a better understanding of the environments
in which people are trying to run GPU jobs on Windows.  Most Linux
users run GPU jobs on dedicated machines because GPU resources can't
really be timeshared.  Are Windows users doing the same?  If so, why
is running a personal Condor under an auto-logged-on user account
unacceptable?  If users are trying to cycle-scavenge Windows machines
with GPUs, one solution, employed by UW's IT department, is to have a
dedicated account that displays a "select OS" window and nothing else
(they use Macs that dual-boot Windows and OS X, but the principle
could be applied to Windows-only machines).  Condor runs in the
background.  When a user wants to use the machine, they choose the OS,
and the machine logs out of the dedicated account and ends the Condor
jobs.
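
For anyone comparing a service install against a personal Condor, one
quick check is to have the job report which Windows session it landed
in.  Below is a minimal sketch, assuming the job can run Python.  The
function name current_session_id is made up for illustration; the only
Windows call used is ProcessIdToSessionId from kernel32.  On anything
other than Windows it simply returns None.

```python
import ctypes
import os
import sys


def current_session_id():
    """Return the Windows session ID of this process, or None off Windows."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes

    session = wintypes.DWORD()
    # ProcessIdToSessionId(DWORD dwProcessId, DWORD *pSessionId) -> BOOL
    ok = ctypes.windll.kernel32.ProcessIdToSessionId(
        os.getpid(), ctypes.byref(session)
    )
    if not ok:
        raise ctypes.WinError()
    return session.value


if __name__ == "__main__":
    sid = current_session_id()
    if sid is None:
        print("not running on Windows")
    elif sid == 0:
        print("running in session 0 (where services run)")
    else:
        print("running in interactive session", sid)
```

A job started by a personal Condor in a logged-on account should
report a non-zero session; one started by the service should report 0.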

On Tue, Jan 17, 2012 at 10:21 AM, Lans Carstensen
<lans.carstensen@xxxxxxxxxxxxxx> wrote:
> +1.  To me, that means working towards adding arbitrary per-system
> resource counters for partitionable slots.  Supporting GPUs in a
> partitionable-slot configuration is not practical today, i.e.
> advertising a partitionable resource of, say, 12 CPUs, 48 GB of
> memory, and 2 GPUs, and passing device-related configuration to the
> partitionable slots requesting a GPU.
>
> This model could then be extended to any other specialized compute.
> It could also be used to solve node-locked licensing.  You could also
> stretch it into some of the other resource counting methodologies
> discussed elsewhere, e.g. you could advertise host network bandwidth
> as a partitionable resource.
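
To make the counter idea above concrete, here is a rough Python sketch
of the bookkeeping involved.  It is not how Condor implements
partitionable slots; the class, method, and resource names are all
hypothetical, and handing out CUDA_VISIBLE_DEVICES is just one assumed
way of passing device-related configuration to the job.

```python
class PartitionableSlot:
    """Hypothetical model of a slot with arbitrary named resource counters."""

    def __init__(self, **resources):
        # Remaining amount of each resource, e.g. cpus=12, memory_gb=48, gpus=2.
        self.free = dict(resources)
        # Device indices not yet handed out to any job.
        self.free_gpu_ids = list(range(resources.get("gpus", 0)))

    def claim(self, **request):
        """Carve the request out of the slot; return None if it does not fit."""
        if any(self.free.get(name, 0) < amount
               for name, amount in request.items()):
            return None
        for name, amount in request.items():
            self.free[name] -= amount
        n = request.get("gpus", 0)
        gpu_ids = self.free_gpu_ids[:n]
        self.free_gpu_ids = self.free_gpu_ids[n:]
        # Assumed mechanism: tell the job which devices it may use.
        env = {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids)}
        return {"request": request, "gpu_ids": gpu_ids, "env": env}


slot = PartitionableSlot(cpus=12, memory_gb=48, gpus=2)
job_a = slot.claim(cpus=4, memory_gb=16, gpus=1)
job_b = slot.claim(cpus=4, memory_gb=16, gpus=2)  # only one GPU is left
```

The same counters would work for node-locked licenses or network
bandwidth; only the GPU case needs the extra step of assigning
specific devices.
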
>
> On Mon, Jan 16, 2012 at 6:45 AM, Tim St Clair <tstclair@xxxxxxxxxx> wrote:
>> Maybe it's worth investigating rCUDA?
>>
>> In general, we really need to start treating GPUs as first-class citizens; maybe target that for the next series?
>>
>> Cheers,
>> Tim
>>
>> ----- Original Message -----
>>> From: "Ziliang Guo" <ziliang@xxxxxxxxxxx>
>>> To: "Todd Tannenbaum" <tannenba@xxxxxxxxxxx>
>>> Cc: "Condor-Users Mail List" <condor-users@xxxxxxxxxxx>
>>> Sent: Saturday, January 14, 2012 12:53:15 PM
>>> Subject: Re: [Condor-users] Condor with CUDA on windows
>>>
>>> None of my preliminary attempts worked.  This included running the
>>> service under my user account and trying that trick with the "allow
>>> interaction with user desktop" setting.
>>>
>>> I dug some more and the only "solution" out there is one Nvidia
>>> provided for their Tesla card drivers:
>>>
>>> http://forums.nvidia.com/index.php?s=2cd29b11a0be8fa2e10b06aacd57beda&showtopic=93450&st=20
>>>
>>> People apparently have had success tricking the drivers to work with
>>> non-Tesla cards, but at this point it seems a crapshoot.  The only
>>> "solution", unless Microsoft or Nvidia and AMD work out a way to
>>> provide access to GPUs in session 0, is for Condor to start up the
>>> process inside a non-session 0 session, at which point we need to
>>> test to make sure the starter can communicate correctly with the
>>> startd, that the starter can actually keep track of the running
>>> process, etc.  We MIGHT be able to tweak the use visible desktop
>>> code to do this.
>>>
>>> Z
>>>
>>> On Fri, Jan 13, 2012 at 4:31 PM, Todd Tannenbaum
>>> <tannenba@xxxxxxxxxxx> wrote:
>>> > On 1/13/2012 4:04 PM, Ziliang Guo wrote:
>>> >>
>>> >> Another user sent a similar question and I responded here:
>>> >> https://lists.cs.wisc.edu/archive/condor-users/2012-January/msg00038.shtml
>>> >>
>>> >> Simplistically, due to the changed security model in Vista and
>>> >> higher, access to the video card is severely restricted in the
>>> >> session 0 that Condor and jobs run in.  Currently the only way to
>>> >> run a CUDA job, or any job that requires a GPU, on Windows using
>>> >> Condor is to run Condor as a personal Condor.  This, however,
>>> >> requires that a user be logged into the machine to start up and
>>> >> keep Condor running.  You also lose the sandboxing from the user
>>> >> that Condor does when running as a service.
>>> >>
>>> >
>>> > So even running Condor as a regular user under the service control
>>> > manager is not enough to allow access to the video card on Vista
>>> > and higher?  Is there perhaps some Windows registry setting or
>>> > some such to relax this restriction?
>>> >
>>> > Sorry I only have questions instead of answers...
>>> >
>>> > Todd
>>>
>>>
>>>
>>> --
>>> Condor Project Windows Developer
>>> _______________________________________________
>>> Condor-users mailing list
>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
>>> with a
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>
>>> The archives can be found at:
>>> https://lists.cs.wisc.edu/archive/condor-users/
>>>



-- 
Condor Project Windows Developer