
[Condor-users] Dynamic memory for SMP



Hi,

I have added some rules to make Condor behave better on an SMP
machine (8 slots of 1 core and 1G of RAM each):
STARTD_JOB_EXPRS = $(STARTD_JOB_EXPRS), ImageSize
TotalMemoryUsed = ( 0 + slot1_ImageSize + slot2_ImageSize + \
    slot3_ImageSize + slot4_ImageSize + slot5_ImageSize + \
    slot6_ImageSize + slot7_ImageSize + slot8_ImageSize )
START = $(START) && TotalMemoryUsed < TotalMemory

We need this because we sometimes have jobs that need more than 1G of
RAM, and if we let such jobs fill the machine, it will thrash too much.

What the rules do is this: if the running jobs already use more than
the TotalMemory of the machine, it won't start new jobs even if slots
are available. This limits the thrashing on the server.
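
As a rough sketch of the intent of that rule (in Python, not Condor
configuration; the function name and the sample numbers are made up for
illustration). It assumes ImageSize is reported in KiB and TotalMemory
in MiB, which is what the "Memory * 1024" in the requirement quoted
further down suggests, so the sketch converts units before comparing:

```python
def can_start_new_job(slot_image_sizes_kib, total_memory_mib):
    """Model of the START rule: accept a new job only while the sum of
    the running jobs' image sizes is below the machine's total memory.

    Assumption: image sizes are in KiB and total memory is in MiB.
    """
    used_mib = sum(slot_image_sizes_kib) / 1024  # KiB -> MiB
    return used_mib < total_memory_mib


# Eight jobs of 512 MiB each on an 8192 MiB machine: still room.
light_load = [512 * 1024] * 8
# Four jobs of 2 GiB each already fill the same machine.
heavy_load = [2 * 1024 * 1024] * 4
```

With these numbers, can_start_new_job(light_load, 8192) is true and
can_start_new_job(heavy_load, 8192) is false, even though four slots
would still be free in the second case.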

But I have one problem: if one of these jobs gets killed, it won't
restart, because the requirement "((Memory * 1024) >= ImageSize)" is
false once its ImageSize exceeds the memory of a single slot. This
requirement is not in the submit file, so I suppose Condor adds it
automatically, as it does some others. What I would like is to replace
it with "(((TotalMemory-TotalMemoryUsed) * 1024) >= ImageSize)", so
that jobs that were killed can be restarted.
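
To make the difference between the two expressions concrete, here is a
small Python sketch (again only a model, not Condor syntax; the function
names and numbers are hypothetical). It assumes Memory, TotalMemory and
TotalMemoryUsed are in MiB and ImageSize is in KiB:

```python
def per_slot_requirement(slot_memory_mib, image_size_kib):
    """Model of ((Memory * 1024) >= ImageSize), checked per slot."""
    return slot_memory_mib * 1024 >= image_size_kib


def machine_wide_requirement(total_memory_mib, total_used_mib, image_size_kib):
    """Model of (((TotalMemory - TotalMemoryUsed) * 1024) >= ImageSize)."""
    return (total_memory_mib - total_used_mib) * 1024 >= image_size_kib


# A job whose recorded image size has grown to 2 GiB.
big_job_kib = 2 * 1024 * 1024
```

For this job, per_slot_requirement(1024, big_job_kib) is false on a 1G
slot, while machine_wide_requirement(8192, 0, big_job_kib) is true on an
idle 8G machine and becomes false again when 7168 MiB are already in use.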

Is there a way to do this?

thanks

Frederic Bastien

p.s. I know the rules I added have a problem. If the server is empty
and the users don't specify an ImageSize, we will start 8 jobs. For
this to work correctly when the server is empty, the users must
estimate the ImageSize their jobs need.
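
A small Python model of this caveat (hypothetical names and numbers,
same unit assumptions as the expressions above: ImageSize in KiB,
TotalMemory in MiB). It treats the declared ImageSize as what each slot
reports at the moment the next job is considered:

```python
def count_started(declared_image_sizes_kib, total_memory_mib, slots=8):
    """Count how many queued jobs start on an initially empty machine
    when each start is gated by 'memory used < total memory'."""
    used_kib = 0
    started = 0
    for declared_kib in declared_image_sizes_kib:
        if started >= slots:
            break  # no free slot left
        if used_kib / 1024 < total_memory_mib:
            used_kib += declared_kib  # the slot now reports this size
            started += 1
    return started


no_estimate = [0] * 8                   # users gave no ImageSize
with_estimate = [2 * 1024 * 1024] * 8   # users declared 2 GiB per job
```

On an 8192 MiB machine, count_started(no_estimate, 8192) gives 8, while
count_started(with_estimate, 8192) gives 4.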
p.p.s. The rules have another shortcoming: they assume that jobs have
no locality in their memory access pattern. For example, a job that
needs a total of 2G of RAM could alternate between the two halves. If
it alternates slowly, we could allow more jobs to run without too much
thrashing, but this is not our case.