[HTCondor-users] StartLog: Failed to authenticate



condor_who -daemons on the central manager (which also has the submit role) shows:

Daemon       Alive  PID    PPID   Exit
------       -----  ---    ----   ----
Collector    yes    1608   1494   no
Master       yes    1494   1      no
Negotiator   yes    1609   1494   no
Schedd       yes    1610   1494   no
SharedPort   yes    1607   1494   no

This looks correct, but on the execute machine the StartLog has several
ERROR: AUTHENTICATE:1003:Failed to authenticate with any method
and
SECMAN: required authentication with collector failed

The central manager CollectorLog shows similar errors:
DC_AUTHENTICATE: required authentication of 192.168.1.5 failed

The firewall isn't active. Where else should I look?

condor_status returns nothing on the central manager.  Is this because it doesn't see any execute machines?
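
For anyone reproducing this, here is a minimal sketch of how the failure lines above could be pulled out of a log, assuming a POSIX shell. The function name auth_failures is hypothetical (made up for illustration), and the patterns are just the messages quoted above, not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: print the authentication-failure lines found in one
# log file. The patterns are taken from the messages quoted in this post.
auth_failures() {
    grep -E 'AUTHENTICATE:1003|required authentication' "$1"
}

# Usage sketch (requires an HTCondor install; LOG is the log directory
# setting mentioned elsewhere in this thread):
#   auth_failures "$(condor_config_val LOG)/StartLog"
#
# Assumption: comparing the SEC_* settings on both machines may help show
# whether they agree on an authentication method:
#   condor_config_val -dump | grep -i '^SEC_'
```

Running the helper on both the execute machine's StartLog and the central manager's CollectorLog would show whether the failures line up in time.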


Thanks,
JK



> On Aug 17, 2023, at 12:28 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
>
>
>      External Email - Use Caution
>
>
>
> One way to troubleshoot is to run
>
>   condor_who -daemons
>
> On the execute node.  This tool scrapes log files to determine which daemons are alive and which are not.
>
> If the condor_master is running, then you can use
>
>   condor_who -quick
>
> which sends a query to the condor_master about the state of the other daemons.
>
> -tj
>
> -----Original Message-----
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Justin Killebrew via HTCondor-users
> Sent: Friday, August 11, 2023 3:03 PM
> To: Todd L Miller <tlmiller@xxxxxxxxxxx>
> Cc: Justin Killebrew <jk@xxxxxxx>; Justin Killebrew via HTCondor-users <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] condor_status returns nothing
>
> The StartLog showed that /var/lib/condor/execute didn't exist.  I created it and restarted condor and now condor_status works as expected.
>
> Thanks!
>
> JK
>
>
>> On Aug 11, 2023, at 3:47 PM, Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
>>
>>> Should there be a startd running?  How do I troubleshoot this installation?
>>
>>      Yes.  First thing to do is look at the MasterLog and StartLog
>> files (which will probably be in /var/log/condor, but you can run
>> `condor_config_val LOG` to find out for sure).  From your process tree, it
>> looks like either the master isn't starting the startd or the startd is
>> crashing (almost?) immediately on start-up.
>>
>> - ToddM
>
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/