[HTCondor-users] Reserving RAM dynamically



Outline question: is there a way to dynamically reserve some resource (in particular RAM) without actually running a job?

Here's the actual scenario. We are using partitionable slots. We have a bunch of jobs which run, where all of them open a very large mmap() read-only database. This uses X GB of memory, but due to sharing, only one chunk of X GB is used regardless of how many jobs have that database open at once.

I'm trying to work out how best to handle this in the condor world.

If we add X to the request_memory of each job, then we run only a fraction of the number of jobs which ought to be able to run concurrently.

If we don't take account of X at all, and each job just declares its "other" memory usage, then we end up with out-of-memory conditions when many instances are running.

If we guess that (say) 20 instances of a job will run at once, then we could add X/20 to the declared memory usage of one job. However this doesn't actually work: given a mixture of different jobs, there might actually be only 1 instance of this job plus 19 instances of other jobs which don't use the database. Again, we run out of memory.
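
To make the trade-off between these three approaches concrete, here is a rough Python sketch of the arithmetic. All the figures (a 64000 MB node, an 8000 MB shared database, 1000 MB of private memory per job) are illustrative assumptions rather than real measurements, and the model ignores everything except memory.

```python
# Toy model of the three accounting schemes. All numbers are hypothetical.
TOTAL_MB = 64000          # assumed memory on one node
SHARED_MB = 8000          # assumed size of the shared mmap()ed database (X)
PRIVATE_MB = 1000         # assumed private memory used by every job
ASSUMED_INSTANCES = 20    # the guess used when spreading X over jobs


def jobs_that_fit(request_mb, available_mb=TOTAL_MB):
    """Number of jobs the scheduler would start, given each job's declared request."""
    return available_mb // request_mb


def actual_usage_mb(db_jobs, other_jobs):
    """Real memory use: the shared database is counted once if any job maps it."""
    shared = SHARED_MB if db_jobs > 0 else 0
    return shared + (db_jobs + other_jobs) * PRIVATE_MB


# Scheme 1: every database job declares private + X.
full = jobs_that_fit(PRIVATE_MB + SHARED_MB)
# What could really run if X were accounted for exactly once.
ideal = jobs_that_fit(PRIVATE_MB, TOTAL_MB - SHARED_MB)

# Scheme 2: X is ignored, jobs declare only their private memory.
ignored = jobs_that_fit(PRIVATE_MB)
ignored_usage = actual_usage_mb(ignored, 0)

# Scheme 3: each database job declares private + X/20, but only one
# database job runs alongside jobs that do not use the database.
spread_request = PRIVATE_MB + SHARED_MB // ASSUMED_INSTANCES
others = jobs_that_fit(PRIVATE_MB, TOTAL_MB - spread_request)
mixed_usage = actual_usage_mb(1, others)

print("scheme 1 runs", full, "jobs; ideal would be", ideal)
print("scheme 2 uses", ignored_usage, "MB of", TOTAL_MB)
print("scheme 3 (mixed load) uses", mixed_usage, "MB of", TOTAL_MB)
```

With these assumed figures, the first scheme starts far fewer jobs than the node could hold, while the other two both let real usage exceed the memory on the node.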

The current approach we are using is to statically subtract X from the total amount of memory available on the server:
MEMORY=($(DETECTED_MEMORY)-xxxxx)

However this is less than ideal because:
1. When we are running jobs which don't use the database, fewer jobs are able to run than otherwise could.
2. It won't work in future when we start using multiple databases concurrently, e.g. some jobs open shared database X and some open shared database Y.
3. The size of databases X and Y will grow over time, and we'd rather not reconfigure and restart condor.

So I am wondering if it is possible to do something like:
- before we run any jobs which open X, we would reduce the available memory on that node by X
- and once we no longer have any jobs which use X, we release it

One way to do that would be to run a dummy job which declares that it uses X amount of memory but does nothing. We could also advertise a machine ClassAd attribute saying that database X is available. This would have to be an infinite-running job, and we would have to kill it when we want to remove the memory reservation for X. This sounds messy.

I wonder if there is a more direct way to make a claim on a resource, without actually having to run a job?

I suppose our ideal solution would be to have direct support for shared RAM resources. For example, a job might declare:

request_memory = 1000
request_shared = FOO:8000

If there is no other job which is currently using FOO, then the available memory pool would be reduced by 8000 before this job starts (perhaps by creating a dummy dynamic slot taking 8000); this clearly has an impact on matchmaking. And when all jobs using FOO terminate, the dummy dynamic slot would be dropped.
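
The bookkeeping described above amounts to reference counting per shared resource. The following is only a toy Python model of that idea, not anything HTCondor provides: the class name `SharedReservations` and its methods are hypothetical, and the 64000 MB node size in the usage lines is an assumed figure.

```python
class SharedReservations:
    """Toy model: a shared resource is charged once, while at least one job uses it."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.job_mb = 0      # sum of private request_memory of running jobs
        self.users = {}      # shared name -> number of running jobs using it
        self.sizes = {}      # shared name -> reserved size, only while in use

    def available_mb(self):
        return self.total_mb - self.job_mb - sum(self.sizes.values())

    def start_job(self, request_mb, shared=None):
        """shared is None or a (name, size_mb) pair. Returns True if the job fits."""
        extra = 0
        if shared is not None and shared[0] not in self.users:
            extra = shared[1]            # first user pays for the reservation
        if request_mb + extra > self.available_mb():
            return False                 # would not match; nothing is changed
        self.job_mb += request_mb
        if shared is not None:
            name, size_mb = shared
            self.users[name] = self.users.get(name, 0) + 1
            self.sizes.setdefault(name, size_mb)
        return True

    def finish_job(self, request_mb, shared=None):
        self.job_mb -= request_mb
        if shared is not None:
            name = shared[0]
            self.users[name] -= 1
            if self.users[name] == 0:    # last user gone: release the reservation
                del self.users[name]
                del self.sizes[name]


# Usage with the request from the example: 1000 private, FOO:8000 shared.
node = SharedReservations(64000)
node.start_job(1000, ("FOO", 8000))      # reserves 8000 for FOO plus 1000
node.start_job(1000, ("FOO", 8000))      # FOO already reserved, only 1000 more
print(node.available_mb())
```

In this model the second job using FOO costs only its private 1000, and the 8000 comes back as soon as the last FOO job finishes, which is the behaviour a dummy slot would have to emulate.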

Any clues for how to deal with this?

Thanks,

Brian Candler.