
Re: [HTCondor-users] setting memory available to cores



Partitionable slots (pslots) are the way to go here: http://research.cs.wisc.edu/htcondor/manual/v8.0/3_5Policy_Configuration.html#SECTION004510600000000000000

See also the ticket on local resource limits around GPUs: https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2905
Other GPU recipes: https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToManageGpus
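
A minimal sketch of what a partitionable-slot setup could look like in condor_config.local, assuming one slot that owns all of the machine's cores and memory. The knob names are the ones described in the manual section linked above; the 100% share is an illustrative choice, not a tested recommendation for this machine:

```
## Replace the eight static slots with a single partitionable slot.
## Assumption: this one slot should own 100% of CPUs, RAM, disk and swap.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
```

With a setup along these lines, a job would normally state how much it needs in its submit file, for example request_memory = 30720 (the value is in MB) together with request_cpus = 1, and the startd carves a dynamic slot of that size out of the partitionable one. The startd has to be restarted for a change in slot layout to take effect.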

Cheers,
Tim


From: "Hugh Jennings" <hugh@xxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Sent: Tuesday, July 2, 2013 12:38:10 PM
Subject: [HTCondor-users] setting memory available to cores

Hi all,

I am hoping there is a way to assign memory to individual slots rather than having the RAM divided evenly among all cores.

The processes we run require a GPU and need as much as 30GB of memory. Our current systems have only 1 GPU each but up to 8 cores. When I start Condor with the default config I see:

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.080 4038  0+00:14:27
slot2@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:14:48
slot3@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:14:49
slot4@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:14:50
slot5@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:10:08
slot6@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:10:09
slot7@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:10:10
slot8@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 4038  0+00:14:46
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     8     0       0         8       0          0        0

               Total     8     0       0         8       0          0        0

I would like to either:
  1. assign most if not all of the memory to a single slot, or
  2. remove the other slots from the available pool.

I have added the following config:

## GPU Config stuff
SLOT1_HAS_GPU=TRUE
SLOT1_GPU_DEV=0
SLOT2_HAS_GPU=FALSE
SLOT3_HAS_GPU=FALSE
SLOT4_HAS_GPU=FALSE
SLOT5_HAS_GPU=FALSE
SLOT6_HAS_GPU=FALSE
SLOT7_HAS_GPU=FALSE
SLOT8_HAS_GPU=FALSE
STARTD_ATTRS=HAS_GPU,GPU_DEV

to the condor_config.local.

Eventually I want to take proper advantage of HTCondor and run a pool of our machines, but I am having problems getting the authorization/authentication working.

I am not a trained admin so I would appreciate instructions or advice that is as explicit as possible.

Regards,

Hugh
