
Re: [HTCondor-users] Reserving RAM dynamically



On 11/16/15 7:28 AM, Brian Candler wrote:
> I'm trying to work out how best to handle this in the condor world.

This really isn't an HTCondor thing. This is interprocess communication
(IPC) and shared memory (SHM) within your programs. That is, your
programs need to be written such that:

* The initial instance (the master) on a given node sets up the shared
memory segments. It needs to permit (or restrict; I'm not entirely clear
on how shmget() and friends function) access to these segments from
other processes with the correct UID or GID. It needs to listen for IPC
requests from other instances of itself.

* Secondary instances need to detect that they are subsequent instances.
They need to identify the master and obtain the relevant identifiers
(keys or names) for the allocated SHM segments via IPC, since raw
pointers are not meaningful across processes.

* Secondary instances probably need to inform the master when they are
about to exit.

* The master probably needs to stay running until all secondaries have
exited. Then it can cleanly release allocated memory and exit.
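
As a rough illustration of the first two points, here is a minimal sketch
in Python using the standard multiprocessing.shared_memory module rather
than shmget() directly. The function names attach_or_create() and
release() and the segment name are hypothetical, chosen only for this
example. It assumes all instances on a node agree on the segment name in
advance, and it leaves out the IPC channel and the exit coordination
described above, which a real program would still need.

```python
from multiprocessing import shared_memory


def attach_or_create(name, size):
    """Return (segment, is_master).

    The first caller on the node creates the named segment and becomes
    the master. Later callers find that the name already exists and
    attach to it as secondaries.
    """
    try:
        seg = shared_memory.SharedMemory(name=name, create=True, size=size)
        return seg, True
    except FileExistsError:
        # Someone else created it first, so this is a secondary instance.
        seg = shared_memory.SharedMemory(name=name)
        return seg, False


def release(seg, is_master):
    """Detach from the segment; only the master removes it."""
    seg.close()
    if is_master:
        # Assumption: the master calls this only after every secondary
        # has already detached. Nothing here enforces that ordering.
        seg.unlink()
```

Each instance would call attach_or_create("some_agreed_name", nbytes) at
startup, work through seg.buf, and call release() on exit. Creation by
name is atomic, which is what lets an instance tell whether it is the
first one, but the "master stays until all secondaries are gone" rule
still has to be implemented separately.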

Most practical examples that I am aware of simplify these problems by
separating the server (master) from client (secondaries) and running
them separately, sometimes with the masters on dedicated nodes. This
model does not lend itself to use under batch queues. The queue manager
could function as a master but there are security concerns that make
this non-trivial to implement.

-- 
Rich Pieri <ratinox@xxxxxxx>
MIT Laboratory for Nuclear Science