
Re: [HTCondor-users] Jobs take ages to start



On 4/21/23 07:48, Gaetan Geffroy wrote:

Hi,

 

Some of my jobs stay in IDLE for a really long time compared to what I have configured.

On the submit node, I set NEGOTIATOR_INTERVAL to 5 seconds, expecting jobs to start extremely quickly after being submitted.

This is a local pool for test and training purposes, so I actually want the jobs to start and execute almost immediately.

 

Looking at the NegotiatorLog file, the negotiation cycles appear to be happening as often as intended, yet most of my jobs stay in the queue for 5 minutes.

Those are docker universe jobs. condor_q -better-analyze tells me there are machines capable of running my jobs.

Even weirder, when I downloaded one of my Docker images as a TAR archive and loaded it manually into Docker, the jobs using that image started immediately.

The same jobs using the same image but from a Docker repo take 5 minutes to start.

 

Is there another knob I could turn to speed up the process?

 


This depends on where in the process the delay is.  Do you know if this is a delay in matchmaking?  In the schedd?  In the starter once the job has begun to start?

-greg