Re: [Condor-users] Condor and GPUs



On Wed, Jul 02, 2008 at 10:33:46AM -0400, Frédéric Bastien wrote:
> Hi,
> 
> Sorry for the long post, but I have looked at many ways of using SMP
> machines with version 7.0.1 of Condor, as that is what we have (machines
> with two quad-core CPUs). Here are the current limitations that I found:

Thanks for stepping forward. I have been thinking along these lines for a
while now, too, but I've been hesitant to publish my as-yet-unordered
thoughts before...

> I had trouble doing this. The problem is that slotX_ImageSize is not an
> up-to-date copy of the value in each slot. In one case, when I did
> "condor_status -l hostname | grep ImageSize", I got:
> 
> [for slot1]
> ImageSize = 1588508
> MonitorSelfImageSize = 30104.000000
> slot1_ImageSize = 1588508
> slot2_ImageSize = 1588508
> [for slot2]
> ImageSize = 29500
> MonitorSelfImageSize = 30104.000000
> slot1_ImageSize = 0
> slot2_ImageSize = 29500
> 
> If condor_status returns the same values as those seen during
> negotiation, this will not work correctly, since not all slots see the
> same version. If I wait 5 minutes, the values are correct.

Yes, it takes another negotiation cycle for the master to notice. If
more jobs are matched before this cycle ends, your changed rules won't
be followed.
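
For the record, the "5 minutes" Frédéric observed matches the startd's
default UPDATE_INTERVAL of 300 seconds. A minimal sketch (untested) of how
I'd narrow the window; STARTD_SLOT_ATTRS is what produces the
slotN_ImageSize attributes shown above:

  # Advertise every slot's ImageSize into every other slot's ad.
  STARTD_SLOT_ATTRS = ImageSize
  # Push updates to the collector every minute instead of every 5 minutes
  # (default 300), so the cross-slot values go stale for less time.
  UPDATE_INTERVAL = 60

Note this only shortens the race between negotiation cycles; it doesn't
close it.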

> In an ideal world, each SMP machine would have a pool of available
> resources. When a job has all the resources it needs, it would create a
> slot and execute there. One way of doing this without too much
> modification to Condor is to generate the slots in advance (e.g. one per
> CPU) and allow execution in them only when the pool of resources has
> what is needed.

Exactly. Pre-defined static slots have to be replaced by something that is
aware of the whole picture (something that sees the *machine* hosting the
slots, so that slots can be dynamically reconfigured, or even created on
demand).
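
For reference, this is roughly what today's static pre-definition looks
like for an 8-core box (a sketch with made-up numbers; everything here is
fixed when the startd starts):

  # Eight identical slots, each pinned to 1/8 of the machine's resources.
  NUM_SLOTS_TYPE_1 = 8
  SLOT_TYPE_1 = cpus=1, memory=1/8, swap=1/8, disk=1/8

Whatever replaces this would have to let slots be carved out of the
machine's totals at match time instead.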

> These are the two current limitations that I see. The first one is the
> priority for me. Or maybe you could add a variable like
> pool_TotalMemoryUsed that uses the up-to-date value, or you could
> hardcode TotalMemoryUsed. If you hardcode it, though, this won't solve
> the issue with custom resources.

Speaking of resources:
Since I have found that users rarely use the memory requirements they
advertise in their submit files, I have added about 20% of the available
swap space to the "real memory" to allow for memory overcommit. Up to now
this has proven safe enough (we don't have forced job termination in
place); not all of the virtual memory of the applications run on our pool
is accessed all the time. We've seen resident/virtual ratios of up to
80%, but often a lot smaller.
I'd hate to lose this opportunity (negative RESERVED_MEMORY)...
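
In config terms the overcommit trick is just this (a sketch, numbers
invented; RESERVED_MEMORY is in megabytes and is subtracted from the
detected physical memory, so a negative value inflates it):

  # Pretend to have 2 GB more memory than is physically present,
  # roughly 20% of this machine's swap space.
  RESERVED_MEMORY = -2048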

Of course, with dynamic slot creation, another problem comes along:
if a machine is already partially taken, how do we define a ranking among
machines that allows for maximum flexibility in the future?
Imagine, for simplicity, a pool consisting of 100 2-core machines.
User A submits 100 jobs, each requiring half the total RAM of a machine
(as in "old style" default slots). If we match them against the first
slot of each machine, user B, who submits some "big" jobs (each taking
almost all the RAM), wouldn't get matched, and the pool would run at 50%
efficiency. This is not as bad as it sounds, since currently the only way
to get B's jobs run without risking (total) memory overcommit would be to
set aside a number of 1-slot machines (wasting a CPU core on each).
(Otherwise user B would have to lie about her memory requirements... and
risk heavy swapping.)

I'm sure the Condor developers have already thought about such scenarios...
("match by best fit"; I'd have to think hard to translate that into an
algorithm.)
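
One crude approximation that wouldn't need any new mechanism: have the
negotiator prefer, among the slots a job matches, the one that leaves the
least memory unused. A sketch (untested, evaluation context hedged; note
also that ImageSize is in KB while Memory is in MB, and that trusting the
job's advertised size is only as good as the users' honesty):

  # Among equally-ranked matches, prefer the tightest fit: the smaller
  # the unused memory remainder, the higher the rank.
  NEGOTIATOR_POST_JOB_RANK = 0 - (Memory - (TARGET.ImageSize / 1024))

This only packs jobs onto the best-fitting existing slot; it doesn't solve
overcommit, but it would keep the "big" slots free for big jobs as long as
possible.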

IMHO it all boils down to dynamic slot definition. Something that would no
longer happen on the execute node but on the master ...

Regards,
 Steffen

-- 
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html