
Re: [HTCondor-users] Advice on configuring job routing instead of flocking between condor clusters

> Currently we are using flocking, but we are hitting the issue where jobs
> stay in idle status for a long time before getting scheduled -- the issue
> mentioned in [1].

You probably could have also adjusted FLOCK_INCREMENT to match the length of your flock list, rather than reordering FLOCK_TO. (The default for FLOCK_INCREMENT is 1 to avoid adding unnecessarily to the load of the flocked-to pool's negotiator.)
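For illustration, that adjustment would look something like the following submit-side configuration (host names here are placeholders, not from the original thread):

```
# condor_config on the submit machine (pool names are placeholders)
FLOCK_TO = pool-b.example.org, pool-c.example.org

# By default FLOCK_INCREMENT = 1, so the schedd tries one additional
# flocked-to pool per cycle while jobs sit idle. Matching it to the
# length of FLOCK_TO lets the schedd advertise to all pools at once.
FLOCK_INCREMENT = 2
```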

> Hence I thought of exploring a job router configuration. While reading [2] I
> realized that I would need to customize both the on-premises and cloud
> condor clusters.

If the condor-in-the-cloud doesn't need to be its own cluster, consider having the cloud execute nodes join your on-premises pool directly.
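In that setup, each cloud execute node only needs to point at the on-premises central manager; a minimal sketch, assuming a shared authentication method is already in place (host name and method are placeholders):

```
# condor_config on a cloud execute node (values are placeholders)
CONDOR_HOST = cm.on-prem.example.org

# Run only the master and startd: this machine executes jobs,
# it does not negotiate or accept submissions itself.
DAEMON_LIST = MASTER, STARTD

# The node must be able to authenticate to the on-prem pool,
# e.g. via pool password or IDTOKENS, configured on both sides.
SEC_DEFAULT_AUTHENTICATION = REQUIRED
```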

> On the cloud condor cluster I need to run HTCondor-C, which probably requires this configuration [3].

If you're actually using 7.9, please upgrade. Otherwise, please look things up in the version of the manual corresponding to the version of HTCondor you're using. I don't know if this particular section actually changed, but a lot does change between versions.

There is no specific configuration required to be the target of Condor-C, but the source of the jobs must be able to authenticate to the target pool. This will typically (but not always) require additional security configuration on both sides. If you can already flock to the target pool, you may already have done all the configuration you need to do; I'm not all that familiar with the job router.
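As a rough sketch of the source side, a Condor-C job is submitted to the grid universe naming the remote schedd and its central manager (both host names below are placeholders, not from this thread):

```
# Submit file on the on-prem side (all host names are placeholders)
universe      = grid
grid_resource = condor cloud-schedd.example.org cloud-cm.example.org
executable    = my_job
output        = my_job.out
error         = my_job.err
log           = my_job.log
queue
```

The corresponding security work is making sure the submitting identity can authenticate and write to that remote schedd, e.g. by extending ALLOW_WRITE and the authentication methods on the target.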

> Also, is using a job router instead of flocking a good idea?

My (limited) understanding is that you won't get much of a speed boost from using the job router unless it's configured to claim jobs very aggressively. I don't know enough about job router configuration to say how easy or hard it is to limit which jobs it picks up quickly or slowly, or how it works in the presence of flocking.
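For reference only, a minimal route in the classic ClassAd syntax that forwards matching jobs to a Condor-C target might look like this (illustrative sketch; the route name, host names, and limits are all placeholders):

```
# condor_config on the routing host (illustrative only)
JOB_ROUTER_ENTRIES = \
  [ name = "to-cloud"; \
    GridResource = "condor cloud-schedd.example.org cloud-cm.example.org"; \
    MaxIdleJobs = 10; \
    MaxJobs = 100; ]
```

How aggressively idle jobs are claimed is governed by knobs like MaxIdleJobs per route, which is the lever the paragraph above alludes to.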

- ToddM