
Re: [HTCondor-users] speeding up matching/scheduling



I add a few machines from another department overnight for more power, then remove them from the pool. After I remove them, is there a setting to make the schedulers + collectors recalibrate the pool size? At the moment I have to wait 30-40 minutes for jobs to get scheduled again, even though there are resources (memory & CPU) in the pool.

On Thu, Jun 9, 2022 at 9:28 AM Beaumont, Martin <Martin.Beaumont@xxxxxxxxxxxxxxx> wrote:

If you have dynamic slots configured and set on auto, like me, here's what I use to speed things up a bit:


# Enable pslot preemption

ALLOW_PSLOT_PREEMPTION = True


# Speed up reclaiming of unused slots

UNUSED_CLAIM_TIMEOUT = 10


Martin


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Cole Bollig via HTCondor-users
Sent: June 8, 2022 4:41 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] speeding up matching/scheduling


Hello Rita,


If you are concerned that your pool is behaving abnormally slowly, we will need more information to help, such as quantitative details about what is happening with the jobs: how long it takes a job to start running, how many jobs are in the queue at a time, and so on. As an example of abnormal behavior: we would expect jobs to start running within about 60 seconds of submission if there are enough open slots, so jobs taking a few minutes to start would be unusual.


If you are just asking how to set up your pool so that it is optimized for large numbers of short-running jobs, I would suggest looking through the large list of configuration options in the HTCondor manual at:

https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html


Since there are so many configuration 'knobs', I took the time to go through and pull out some that may help with your goal. It will probably take some time and tweaking of these 'knobs' to get the outcome you want.


- NUM_CLAIMS

- MAX_JOBS_RUNNING

- MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER

- NEGOTIATOR_INTERVAL

- NEGOTIATOR_MAX_TIME_PER_*

- NEGOTIATOR_DEPTH_FIRST
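
As a rough sketch of how a few of the knobs listed above might look in a local configuration file: the values below are illustrative assumptions and not recommendations made in this thread. The figure of 128 is taken from the pool size mentioned in the original question; everything else would need tuning against your own pool.

```
# Illustrative values only (assumptions); tune for your own pool.

# Start negotiation cycles more often than the default of 60 seconds.
NEGOTIATOR_INTERVAL = 30

# With partitionable slots, try to fill one machine before moving on to the next.
NEGOTIATOR_DEPTH_FIRST = True

# Upper bound on the number of jobs this schedd runs at once.
# 128 is assumed here to match the 128-core pool from the original question.
MAX_JOBS_RUNNING = 128
```

Changing one knob at a time and watching how long jobs take to start should make it easier to see which setting actually helps.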


Best of luck,

Cole Bollig


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Rita <rmorgan466@xxxxxxxxx>
Sent: Monday, June 6, 2022 9:47 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] speeding up matching/scheduling


We have a relatively small pool -- 128 cores.

I would like condor to be more responsive in terms of scheduling, because the jobs only last 5 minutes but we have a few hundred of them. What can I do with the negotiator, collector, etc. to make it quicker at handling this sort of workload?


--

--- Get your facts first, then you can distort them as you please.--


