[HTCondor-users] Possible delays for starting a shadow?



Hi,

at the moment I am investigating a case in which a scheduler delays starting a shadow for unknown reasons.

The job is minimal: no file transfers, it just executes some echo statements. As soon as the job is actually started, it finishes immediately.

When submitted to another sched (which should have the same config), there is consistently no delay.

From the logs, which are currently still at the default level, I see:

* job is submitted to and transformed on sched
* negotiator matches the job to a worker a short time later

In the working case, the shadow is started on the sched without any delay.

Not so on the sched I am looking at:

* the job is in running state according to condor_q
* the job ad has no mention of the matched worker node (yet)
* on the worker I find nothing about the job id in the logs 
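
A minimal sketch of how the "nothing about the job id in the logs" check could be scripted, so that it can be repeated on both the sched and the worker. The function name `lines_mentioning_job` is hypothetical, and the sketch only assumes that the job id appears somewhere in the log text in the usual `cluster.proc` form; it does not rely on any particular HTCondor log layout.

```python
import re
from typing import Iterable, List


def lines_mentioning_job(lines: Iterable[str], job_id: str) -> List[str]:
    """Return the lines that mention job_id (e.g. "1234.0") as a whole token.

    Assumption: the id is written as cluster.proc somewhere in the line.
    """
    # Reject partial matches such as "11234.0" (leading digit)
    # or "1234.05" (trailing digit).
    pattern = re.compile(r"(?<![\d.])" + re.escape(job_id) + r"(?!\d)")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]
```

It can be fed an open log file, for example `lines_mentioning_job(open(path), "1234.0")`, where `path` is whatever log file is being inspected; both the path and the job id here are placeholders.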

Right now I actually have a job id for which I am waiting for the shadow to start, with the log level increased to D_FULLDEBUG, but there is no output for the job yet. However, the increase happened after matching, so I might have missed the interesting part.

Do you have any ideas what could cause this behavior?

Best
  Kruno

-- 
------------------------------------------------------------------------
Krunoslav Sever            Deutsches Elektronen-Synchrotron (IT-Systems)
                        Ein Forschungszentrum der Helmholtz-Gemeinschaft
                                                            Notkestr. 85
phone:  +49-40-8998-1648                                   22607 Hamburg
e-mail: krunoslav.sever@xxxxxxx                                  Germany
------------------------------------------------------------------------