Re: [HTCondor-users] Error with global Queue



Both commands yield valid results without errors.

In the Central Manager's CollectorLog I have:

>01/08/18 10:03:14 PERMISSION DENIED to condor_pool@xxxxxxxxxxxxx from host xxx.xxx.xxx.60 for command 10 (QUERY_STARTD_PVT_ADS), access level NEGOTIATOR: reason: cached result for NEGOTIATOR; see first case for the full reason
>01/08/18 10:03:14 DC_AUTHENTICATE: Command not authorized, done!
>01/08/18 10:03:20 Got QUERY_STARTD_ADS
>01/08/18 10:03:20 Number of Active Workers 0
>01/08/18 10:03:20 Got QUERY_STARTD_ADS
>01/08/18 10:03:20 Number of Active Workers 0
>01/08/18 10:03:26 Got QUERY_STARTD_PVT_ADS
>01/08/18 10:03:26 Number of Active Workers 0
>01/08/18 10:03:26 Number of Active Workers 0
>01/08/18 10:03:26 DaemonCore: Can't receive command request from xxx.xxx.xxx.105 (perhaps a timeout?)

xxx.60 is one of my submit nodes, and xxx.105 is the central manager.

There are similar entries for other nodes. Looking through the logs for a
bit more detail, I found:

>01/08/18 09:59:56 DaemonCore: Can't receive command request from xxx.xxx.xxx.105 (perhaps a timeout?)
>01/08/18 09:59:56 PERMISSION DENIED to condor_pool@xxxxxxxxxxxxxxxxxxx from host xxx.xxx.xxx.52 for command 10 (QUERY_STARTD_PVT_ADS), access level NEGOTIATOR: reason: cached result for NEGOTIATOR; see first case for the full reason
>01/08/18 09:59:56 DC_AUTHENTICATE: Command not authorized, done!
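
To see at a glance which hosts are being refused and at which access level,
something like the following rough sketch could be run over the CollectorLog.
The regular expression is only an assumption modeled on the PERMISSION DENIED
lines quoted above (it expects each entry on a single line, as in the log file
itself), and the name summarize_denials is hypothetical, not part of HTCondor.

```python
import re
from collections import Counter

# Pattern modeled on the PERMISSION DENIED entries quoted in this message.
# This is an assumption about the line layout, not an official log grammar.
DENIED = re.compile(
    r"PERMISSION DENIED to (?P<user>\S+) from host (?P<host>\S+) "
    r"for command \d+ \((?P<command>\w+)\), access level (?P<level>\w+)"
)


def summarize_denials(lines):
    """Count denied requests per (host, command, access level)."""
    counts = Counter()
    for line in lines:
        # Normalize runs of whitespace before matching.
        match = DENIED.search(" ".join(line.split()))
        if match:
            key = (match.group("host"), match.group("command"), match.group("level"))
            counts[key] += 1
    return counts
```

Passing an open file handle for the CollectorLog to summarize_denials returns
a Counter keyed by (host, command, access level); its most_common() method
lists the most frequently denied combinations first.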

Thank you for any further insight you can provide!
-Brandon


On 1/8/18 9:46 AM, John M Knoeller wrote:
> I think this means that condor_q is unable to fetch schedd ads from the collector.   
>
> Try running 
>
>    condor_status -schedd
>
> Do you get the same error?
>
> Does a simple
>
>    condor_status
>
> work?
>
> If you look in the CollectorLog on the central manager, do you see any messages about the rejected query?
>
> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Brandon Graves
> Sent: Monday, January 8, 2018 11:26 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Error with global Queue
>
> Hello All,
>
> I recently replaced my Central Manager, and a few odd things have come
> up. The only definite error message I can find though happens when
> "condor_q -global" is run:
>
>> -- Failed to fetch ads from: <xxx.xxx.xxx.49:9618?addrs=xxx.xxx.xxx.49-9618+[> : server1.my.domain.com
>> AUTHENTICATE:1003:Failed to authenticate with any method
>> AUTHENTICATE:1004:Failed to authenticate using GSI
>> GSI:5003:Failed to authenticate. Globus is reporting error (851968:50). There is probably a problem with your credentials. (Did you run grid-proxy-init?)
>> AUTHENTICATE:1004:Failed to authenticate using KERBEROS
>> AUTHENTICATE:1004:Failed to authenticate using FS
>
> My basic configuration is a Central Manager connected to 2 submit nodes.
> Each submit node seems to be able to see its own queue. One of the
> submit nodes seems to have trouble running jobs off and on, but I
> can't find any errors that make sense. For now I'd like to figure out
> the global queue error, as I suspect the two are related.
>
> My config file as far as authentication goes looks like this:
>
>
>> SEC_PASSWORD_FILE = /etc/condor/pool_password
>> SEC_DAEMON_AUTHENTICATION = REQUIRED
>> SEC_DAEMON_INTEGRITY = REQUIRED
>> SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
>> SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
>> SEC_NEGOTIATOR_INTEGRITY = REQUIRED
>> SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
>> SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD, KERBEROS, GSI
>
> (I didn't do the initial install/configuration of HTCondor on these
> systems. I'm just the new admin for them, and still getting my footing.)
>
> I've looked through some of the logs, but I can't seem to find any
> specific error messages that point me in a new direction. Any
> tips, tricks, or ideas would be appreciated.
>
>
> --Brandon
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/