
Re: [HTCondor-users] speeding up matching/scheduling



If you have dynamic slots configured and set to auto, as I do, here's what I use to speed things up a bit:

 

# Enable pslot preemption

ALLOW_PSLOT_PREEMPTION = True

 

# Speed up reclaiming of unused slots

UNUSED_CLAIM_TIMEOUT = 10

 

Martin

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Cole Bollig via HTCondor-users
Sent: June 8, 2022 4:41 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] speeding up matching/scheduling

 

Hello Rita,

 

If you are concerned that your pool is taking abnormally long to start jobs, then we will need some more information to help: quantitative details about what is happening with the jobs, such as the time it takes for a job to start running, the number of jobs in the queue at a time, etc. As an example of abnormal behavior, we would expect jobs to start running within about 60 seconds of submission if there are enough open slots, so jobs taking a few minutes to start would be abnormal.

 

If you are just asking how to set up your pool so that it is optimized for large numbers of short-running jobs, I would suggest looking through our large list of configuration options in the HTCondor manual at:

 

Since there are so many configuration 'knobs', I took the time to go through and pull some out that may be beneficial towards your goal. It will probably take some time and tweaking of these 'knobs' to get your desired outcome. 

 

- NUM_CLAIMS

- MAX_JOBS_RUNNING

- MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER

- NEGOTIATOR_INTERVAL

- NEGOTIATOR_MAX_TIME_PER_*

- NEGOTIATOR_DEPTH_FIRST
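
As a rough sketch of how the knobs above might appear in a config file: the values below are illustrative placeholders, not recommendations, and NEGOTIATOR_MAX_TIME_PER_SUBMITTER is shown only as one example of the NEGOTIATOR_MAX_TIME_PER_* family. Check the manual for the defaults and exact semantics in your version before changing anything.

# Startd: number of claims a partitionable slot advertises (assumed value)
NUM_CLAIMS = 8

# Schedd: upper limit on simultaneously running jobs (assumed value)
MAX_JOBS_RUNNING = 500

# Schedd: limit on running scheduler universe jobs per owner (assumed value)
MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER = 10

# Negotiator: seconds between negotiation cycles (assumed value)
NEGOTIATOR_INTERVAL = 20

# Negotiator: time budget in seconds per submitter in a cycle (assumed value)
NEGOTIATOR_MAX_TIME_PER_SUBMITTER = 60

# Negotiator: with partitionable slots, pack jobs onto one machine before moving to the next
NEGOTIATOR_DEPTH_FIRST = True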

 

Best of luck,

Cole Bollig


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Rita <rmorgan466@xxxxxxxxx>
Sent: Monday, June 6, 2022 9:47 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] speeding up matching/scheduling

 

We have a relatively small pool -- 128 cores. 

I would like Condor to be more responsive in scheduling, because the jobs last only 5 minutes each but we have a few hundred of them. What are some things I can do with the negotiator, collector, etc. to make it handle this sort of workload more quickly?

 

--

--- Get your facts first, then you can distort them as you please.--