Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [HTCondor-users] sharing resources without preempt, suspend and kill ?

Date: Wed, 22 Apr 2015 11:28:46 -0400
From: Ben Cotton <ben.cotton@xxxxxxxxxxxxxxxxxx>
Subject: Re: [HTCondor-users] sharing resources without preempt, suspend and kill ?

Laurent,

I would expect the priority mechanism to take care of this for you.
What version of HTCondor are your execute nodes running? One thing
that comes to mind is perhaps you have an infinite CLAIM_WORKLIFE, so
the schedd/startd keep working together without going back to the
negotiator for a match. The default value of CLAIM_WORKLIFE changed
from -1 (infinite) in 7.8 to 3600 (1 hour) in 8.0 to 1200 (20 minutes)
in 8.2. So if your CLAIM_WORKLIFE is infinite, consider shortening it
to see if that helps. The tradeoff is more scheduling overhead, since
claims go back to the negotiator more often, but for a pool of your
size that's not going to be an issue.
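
As a rough sketch of what that could look like in the local
configuration on the execute nodes (the value simply mirrors the 8.2
default mentioned above; it is an illustrative assumption, not a tuned
recommendation for your pool):

```
# Local HTCondor config on the execute nodes (illustrative value only).
# After this many seconds a claim stops accepting additional jobs, so the
# slot goes back to the negotiator for a fresh match. A job that is
# already running is allowed to finish; it is not preempted.
CLAIM_WORKLIFE = 1200
```

You can see the value currently in effect with
"condor_config_val CLAIM_WORKLIFE", and running condor_reconfig after
the change should pick it up (restart the startd if it does not seem
to take effect).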

I wrote a post on the Cycle Computing blog about this last year that
might be helpful:
http://www.cyclecomputing.com/blog/how-to-use-htcondors-claim_worklife-to-optimize-cluster-throughput/


Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing
