
Re: [Condor-users] ENFORCE_CPU_AFFINITY and dynamic slots



Dan,

Thanks for the confirmation. When using a job wrapper, I presume it would have to call taskset directly, right? The problem with that is deciding which processors to set: the only relevant information I can see in the .machine.ad file that gets created is the value of TotalSlotCpus, not a list of "assigned" cores, so I can't see a way to decide which cores to assign to different slots on the same machine. Is there a way for slots to share information about each other? For standard universe jobs the situation is even worse, as they don't even create a .machine.ad file.
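
For illustration, a minimal Python sketch of how a wrapper might read the core count from the .machine.ad file. It assumes the file holds one "Name = value" pair per line; the helper names parse_ad and slot_cpus are hypothetical, and only TotalSlotCpus comes from the discussion above.

```python
def parse_ad(text):
    """Parse 'Name = value' lines into a dict of raw strings.

    Assumption: one attribute per line, split at the first '='.
    """
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            # Drop surrounding whitespace and the quotes around string values.
            attrs[name.strip()] = value.strip().strip('"')
    return attrs


def slot_cpus(text):
    """Return the number of cores the slot was given (TotalSlotCpus)."""
    return int(parse_ad(text)["TotalSlotCpus"])
```

This only yields how many cores the slot has, not which ones, which is exactly the gap described above.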

Actually, I can see a way, but it seems rather overcomplicated. Hence, I'll outline it here but I would be grateful if you can offer anything simpler.

1) The wrapper script initially scans the scratch directories of all other jobs currently running on this machine (in $(EXECUTE)) for a list of their chosen cores (generated by their wrapper file).
2) The wrapper then generates a list of available cores and uses taskset to reserve them.
3) It then writes a file, e.g. .cores.reserved, with a list of the cores it has chosen (this is the file that it looks for from other jobs in step 1).
4) It execs to the actual job.

Obviously there is a race condition if multiple jobs start up at the same time, so file locks or similar would need to be used when reading the .cores.reserved files.
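
The four steps and the locking could be sketched roughly as follows in Python. This is a sketch under several assumptions, not a tested recipe: scratch directories are assumed to be named dir_* under $(EXECUTE) and readable by other jobs, a shared lock file named .cores.lock (hypothetical) is assumed to be writable by every job, and os.sched_setaffinity is used in place of the taskset command so that the affinity is inherited across the exec. All function names are made up for illustration.

```python
import fcntl
import os
from pathlib import Path

RESERVED_NAME = ".cores.reserved"  # file name from step 3
LOCK_NAME = ".cores.lock"          # hypothetical shared lock file


def read_reserved(execute_dir, own_scratch):
    """Step 1: collect the cores listed by other jobs' scratch directories."""
    own = Path(own_scratch).resolve()
    taken = set()
    # Assumption: each job's scratch directory is $(EXECUTE)/dir_<something>.
    for path in Path(execute_dir).glob("dir_*/" + RESERVED_NAME):
        if path.parent.resolve() == own:
            continue  # ignore our own reservation
        taken.update(int(tok) for tok in path.read_text().split())
    return taken


def choose_cores(all_cores, taken, wanted):
    """Step 2: pick the lowest-numbered cores that nobody has reserved."""
    free = sorted(set(all_cores) - set(taken))
    if len(free) < wanted:
        raise RuntimeError("not enough free cores")
    return free[:wanted]


def reserve_and_exec(execute_dir, own_scratch, wanted, argv):
    """Steps 1 to 4, with an exclusive lock around the read-choose-write part."""
    with open(Path(execute_dir) / LOCK_NAME, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # released when the file is closed
        taken = read_reserved(execute_dir, own_scratch)
        cores = choose_cores(os.sched_getaffinity(0), taken, wanted)
        # Step 3: publish the choice for later jobs to find.
        reserved = Path(own_scratch) / RESERVED_NAME
        reserved.write_text(" ".join(map(str, cores)) + "\n")
    os.sched_setaffinity(0, cores)  # pin this process; the job inherits it
    os.execvp(argv[0], argv)        # step 4: become the actual job
```

A wrapper would call something like reserve_and_exec(execute_dir, scratch_dir, n, sys.argv[1:]), with n taken from TotalSlotCpus. One property of keeping the reservation file inside the scratch directory is that it disappears when the scratch directory is cleaned up, so no separate release step is needed.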

Regards,
Bob


From: Dan Bradley <dan@xxxxxxxxxxxx>

Bob,

You are right.  There is currently no good way to configure cpu affinity for dynamic slots, because SLOT<N>_CPU_AFFINITY isn't really appropriate for dynamic slots.  A job wrapper could be used to do it.

I talked recently with Greg about plans to provide a better solution.

--Dan

On 9/26/12 10:21 AM, Bob Briscoe wrote:
Hi,
Am I right in thinking that the ENFORCE_CPU_AFFINITY setting currently only works with static slots, i.e. either the default case of a 1-core slot, or pinned to nominated cores for a multi-core slot? This is my understanding from reading section 3.3.13 of the 7.8 manual. Is there a way of defining CPU affinity for dynamic slots? I'm only interested in the Linux case.
Regards,
Bob


