
Re: [Condor-users] controlling memory intensive jobs



Ian:

I am curious about your dynamic policies now. At our lab these servers
keep having memory problems.

I looked at http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.1/html/Grid_User_Guide/chap-Grid_User_Guide-Dynamic_provisioning.html
and tried to set up dynamic provisioning.

I have a question about SLOT_TYPE_X_PARTITIONABLE.

What is "X"? Do I need to do this in my configuration?

SLOT_TYPE_0_PARTITIONABLE
SLOT_TYPE_1_PARTITIONABLE
SLOT_TYPE_3_PARTITIONABLE
SLOT_TYPE_4_PARTITIONABLE
SLOT_TYPE_5_PARTITIONABLE
...
SLOT_TYPE_15_PARTITIONABLE

for a 15 core box?

Also, do I need to set this:
PartitionableSlot=TRUE

I ran condor_status -l but I don't see DynamicSlot=TRUE.


Any thoughts?

TIA

On Thu, Sep 24, 2009 at 9:37 AM, Ian Chesal <ICHESAL@xxxxxxxxxx> wrote:
>> We have 10 servers which have 64GB of memory with 16 cores. We don't
>> want people to run all of their memory intensive jobs at once,
>> since it would crash the box. What do Condor admins typically do to
>> control this? So that only 10 jobs run on 10 different servers?
>
> I make all my users tell me up front how much memory their job needs to
> run. It's a rough guess, but enough to make sure Condor doesn't schedule
> too many memory intensive jobs on my machines. In the back end I bin the
> memory request so jobs are in one of 5 memory size estimate buckets.
> This makes them easier to deal with when planning machine setups. I
> don't allocate my machine resources evenly across slots. I unbalance
> them on purpose to service the 5 bins of memory requirements accordingly.
>
> It can be less efficient if all the jobs in your queue are in the
> largest memory bin -- you end up with slots that are allocated with too
> little memory to run these going unused. But it's better than having
> jobs fail. And it'll hold until dynamic machine partitioning is
> mainstream in Condor.
>
> - Ian
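
For illustration, the binning step Ian describes might look roughly like the sketch below. It is only a guess at the idea, not his actual back end: the five bucket sizes and the name bin_memory_request are made up here, since the message does not say what the real bins are.

```python
from bisect import bisect_left

# Hypothetical bucket ceilings in MB. The original message says there are
# 5 bins but does not give their sizes, so these values are placeholders.
BIN_CEILINGS_MB = [1024, 2048, 4096, 8192, 16384]


def bin_memory_request(request_mb):
    """Round a user's rough memory estimate up to the smallest bucket that holds it."""
    if request_mb <= 0:
        raise ValueError("memory request must be positive")
    # bisect_left returns the index of the first ceiling >= request_mb.
    index = bisect_left(BIN_CEILINGS_MB, request_mb)
    if index == len(BIN_CEILINGS_MB):
        raise ValueError("request exceeds the largest bucket")
    return BIN_CEILINGS_MB[index]


if __name__ == "__main__":
    for estimate in (300, 1500, 4096, 9000):
        print(estimate, "MB ->", bin_memory_request(estimate), "MB bucket")
```

Rounding up rather than to the nearest bucket means a job is never placed in a slot smaller than its own estimate, at the cost of some wasted memory per slot.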