
Re: [HTCondor-users] Job not starting while matching nodes available // User priority for user@us is not available, attempting to analyze without it



Hi Thomas,

What are the security settings on the worker nodes in your pool like?

Is it possible that the new schedd hasn't been added to a whitelist of some sort that is used to populate ALLOW_DAEMON, ALLOW_WRITE, etc. on the worker nodes?
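
To illustrate what I mean, here is a minimal Python sketch of how a stale allow list could reject the new schedd. It is only an assumption about your setup: the list value is hypothetical, and the matching is simplified to hostname wildcards, whereas real ALLOW_* entries can also contain user@host forms, IP addresses and netmasks. The actual value on a worker node can be read with `condor_config_val ALLOW_WRITE`.

```python
from fnmatch import fnmatch


def host_allowed(host: str, allow_list: str) -> bool:
    """Return True if host matches any entry of a comma- or space-separated list.

    Simplified on purpose: only hostname wildcards are handled here.
    """
    entries = allow_list.replace(",", " ").split()
    return any(fnmatch(host.lower(), entry.lower()) for entry in entries)


# Hypothetical worker-node value, written before the second CE existed.
allow_write = "grid-arcce0.desy.de, *.wn.example.org"

for schedd in ("grid-arcce0.desy.de", "grid-arcce1.desy.de"):
    status = "allowed" if host_allowed(schedd, allow_write) else "NOT allowed"
    print(f"{schedd}: {status}")
```

If the real list looks like this, the old CE passes the check and the new one does not, which would fit a job that matches but never starts.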

Are there any error messages or connection-reset failures shown in the schedd and shadow logs?

Cheers, Iain
________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Thomas Hartmann [thomas.hartmann@xxxxxxx]
Sent: 09 May 2016 12:33
To: htcondor-users
Subject: [HTCondor-users] Job not starting while matching nodes available // User priority for user@us is not available, attempting to analyze without it

Hi all,

we have added a second ARC CE at our site, which submits to our Condor pool.
However, its jobs do not get started although matching nodes are
available [1]. AFAICS both submit hosts should be identical. But for our
new one, -analyze complains that
"User priority for desysgm000@xxxxxxx is not available, attempting to
analyze without it."
which puzzles me, since I had assumed the user priority to be a pool
property that does not depend on the submit host. The same job submitted
on our working CE/schedd starts immediately [2].

I also tried to boost the priority of the individual jobs, but that did
not get them started either.

So, I wonder what is missing for our new schedd?

Cheers and thanks,
  Thomas



[1]
> condor_q -n grid-arcce1.desy.de -analyze 2701.0


-- Schedd: grid-arcce1.desy.de : <131.169.223.111:9620?...
---
2701.000:  Request has not yet been considered by the matchmaker.

User priority for desysgm000@xxxxxxx is not available, attempting to
analyze without it.
---
2701.000:  Run analysis summary.  Of 2098 machines,
      0 are rejected by your job's requirements
    126 reject your job because of their own requirements
      0 match and are already running your jobs
   1966 match but are serving other users
      6 are available to run your job


[2]
> condor_q -n grid-arcce0.desy.de -analyze 617405.0


-- Schedd: grid-arcce0.desy.de : <131.169.223.110:9620?...
---
617405.000:  Request is running.