
Re: [Condor-users] Feature Request: ability to set processor affinity for a slot



2009/1/14 Ian Chesal <ICHESAL@xxxxxxxxxx>:
>> 2009/1/8 Ian Chesal <ICHESAL@xxxxxxxxxx>:
>> [snip]
>>> I'd like to be able to tell Condor, preferably on a per-job basis but
>>> even on a per-slot basis this would be great, to spawn the job command
>>> with a processor affinity bit mask so the command and any child threads
>>> from it are locked to a particular processor on a machine (or a
>>> particular set of processors).
>>>
>>> Right now I'm handling this by wrapping my *real* command in a higher
>>> level command that Condor calls and the wrapper calls my real command
>>> with a custom executable I've written that sets the processor affinity
>>> bit mask for the real command.
>> [snip]
>>
>> I'm currently also investigating a similar problem to this. I've got a
>> USER_JOB_WRAPPER script that sets processor affinity based on the
>> value of the environment variable _CONDOR_SLOT. What I've not yet
>> managed to figure out is a way to make this integrate with the
>> RequiresWholeMachine mechanism from http://nmi.cs.wisc.edu/node/1482.
>>
>> Basically putting
>> +RequiresWholeMachine = true
>> in a job submission causes the job to only run on slot1 and other
>> slots on the machine to be set to owner. Which brings me to my real
>> question: How can I cause the +RequiresWholeMachine in the job
>> submission to set an environment variable for the USER_JOB_WRAPPER
>> script?
>>
>> I slightly optimistically tried using:
>> STARTER_JOB_ENVIRONMENT = WHOLE_MACHINE_JOB=$$([$(MY.RequiresWholeMachine)])
>> Which didn't work, as I suspected.
>>
>> Any ideas?
>
> You can tell condor_submit to put this in the environment of the job as
> part of the submission ticket:
>
> environment = WHOLE_MACHINE_JOB=$(RequiresWholeMachine)
>
> If that doesn't work you can tell the Startd to advertise
> RequiresWholeMachine as part of its ClassAd when a job starts by
> adding the following to your condor_config file on the machine:
>
> STARTD_JOB_EXPRS = $(STARTD_JOB_EXPRS), RequiresWholeMachine

I'm already doing this for the start and suspend conditions to work. I
could just read the machine's or job's ClassAd in the USER_JOB_WRAPPER
script using condor_status or whatever, but that seems over the top
for something I thought I should be able to set in the environment.
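
For illustration, a minimal sketch of the "read the job's ClassAd in the wrapper" fallback, without calling condor_status. It assumes the starter writes the job ad to a file and puts its path in the _CONDOR_JOB_AD environment variable (later HTCondor releases do this; whether the version discussed here does is an assumption), and that the ad uses the plain one-attribute-per-line "Name = value" format. The function names are hypothetical.

```python
import os


def read_classad_attr(ad_text, name):
    """Return the raw value of attribute `name` from ClassAd text written
    one 'Name = value' per line, or None if the attribute is absent.
    Attribute names are compared case-insensitively."""
    for line in ad_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None


def requires_whole_machine(ad_text):
    """True only if the ad sets RequiresWholeMachine to the literal true."""
    value = read_classad_attr(ad_text, "RequiresWholeMachine")
    return value is not None and value.lower() == "true"


def load_job_ad(env=os.environ):
    """Read the job ad file named by _CONDOR_JOB_AD (assumed to be set by
    the starter); return an empty string if it is missing."""
    path = env.get("_CONDOR_JOB_AD")
    if not path or not os.path.isfile(path):
        return ""
    with open(path) as handle:
        return handle.read()
```

A wrapper could then call requires_whole_machine(load_job_ad()) and treat a missing ad as a normal single-slot job.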

> And then putting the following in your submit ticket:
>
> environment = WHOLE_MACHINE_JOB=$$(RequiresWholeMachine)
>
> One of those will work. I think. :)
>
I was hoping to avoid requiring any modification to submit tickets if
at all possible (otherwise it makes it possible for a user to omit one
half of this). Users are already using the RequiresWholeMachine setup,
and I was looking to use processor affinity to enforce the slots or
whole machine rules specified.
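
To make the idea concrete, here is a rough sketch of a USER_JOB_WRAPPER that pins the job according to _CONDOR_SLOT. It is not the script described above: the slot-to-CPU mapping (slot N gets CPU N-1), the WHOLE_MACHINE_JOB variable being present in the environment, and the helper name cpus_for_slot are all assumptions for illustration. It relies on os.sched_setaffinity, which is available on Linux only.

```python
import os
import re
import sys


def cpus_for_slot(slot, total_cpus, whole_machine=False):
    """Pick the CPU set for a slot (hypothetical mapping: slot N -> CPU N-1).
    A whole-machine job, or a slot name with no trailing number, gets
    every CPU."""
    if whole_machine:
        return set(range(total_cpus))
    match = re.search(r"(\d+)$", slot)
    if not match:
        return set(range(total_cpus))
    index = int(match.group(1)) - 1  # slots are numbered from 1
    return {index % total_cpus}


def main(argv):
    # Condor invokes the wrapper as: wrapper <job executable> <job args...>
    slot = os.environ.get("_CONDOR_SLOT", "")
    # Assumption: WHOLE_MACHINE_JOB has been placed in the job environment.
    whole = os.environ.get("WHOLE_MACHINE_JOB", "").lower() == "true"
    cpus = cpus_for_slot(slot, os.cpu_count() or 1, whole)
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # 0 means the current process
    # The affinity mask is inherited across exec and by child processes.
    os.execvp(argv[1], argv[1:])


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Because the wrapper replaces itself with the job via exec, the job and any threads or children it spawns start with the restricted mask.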

Alan