
Re: [HTCondor-users] Negotiator not running



Quick question: is the fact that this host is the only Central Manager
a problem for the HA daemon? Should it be shut down when there is
only one host?

Cheers,
Jose

On Wed, 9 Jun 2021 at 10:21, Jose Caballero
(<jcaballero.hep@xxxxxxxxx>) wrote:
>
> Still getting the same behaviour even after opening the port...
>
> On Wed, 9 Jun 2021 at 9:44, Jose Caballero
> (<jcaballero.hep@xxxxxxxxx>) wrote:
> >
> > Hi,
> >
> > Following Steve's recommendation, I am having a look at HADLog.
> > I see an ERROR message, but I am not sure how to interpret it.
> >
> >            "HAD CONFIGURATION ERROR: pid 32070 address ':51450' not valid"
> >
> > Does that mean that port 51450 needs to be open on the host?
> > Only open ports are 9618 and 9619.
> >
> > As always, thanks a lot in advance.
> > Cheers,
> > Jose
> >
> >
> > > On Fri, 4 Jun 2021 at 13:58, <timm@xxxxxxxx> wrote:
> > >
> > > > Yes, so look at why the HAD daemon is failing and you will get the answer.
> > >
> > > Steve
> > >
> > > ________________________________
> > > From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of jcaballero.hep@xxxxxxxxx <jcaballero.hep@xxxxxxxxx>
> > > Sent: Friday, June 4, 2021 7:40 AM
> > > To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> > > Cc: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> > > Subject: Re: [HTCondor-users] Negotiator not running
> > >
> > > Hmm, you may be right...
> > >
> > > 06/04/21 12:20:42 Started DaemonCore process "/usr/sbin/condor_had", pid and pgroup = 32215
> > > 06/04/21 12:20:42 Handling daemon-specific command for "NEGOTIATOR"
> > > 06/04/21 12:20:42 DefaultReaper unexpectedly called on pid 32215, status 0.
> > > 06/04/21 12:20:42 The HAD (pid 32215) exited with status 0
> > > 06/04/21 12:20:42 Killing HAD's controllee (NEGOTIATOR)
> > > 06/04/21 12:20:42 restarting /usr/sbin/condor_had in 1384 seconds
> > >
> > >
> > >
> > > > On Fri, 4 Jun 2021 at 13:37, <timm@xxxxxxxx> wrote:
> > > >
> > > > What does the MasterLog say as you start htcondor up?
> > > > > If you have HAD and REPLICATION, it could be that it is detecting a negotiator already
> > > > > running on another machine and thinks it does not need to run another one.
> > > >
> > > > Steve
> > > >
> > > > ________________________________
> > > > From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of jcaballero.hep@xxxxxxxxx <jcaballero.hep@xxxxxxxxx>
> > > > Sent: Friday, June 4, 2021 6:09 AM
> > > > To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> > > > Subject: [HTCondor-users] Negotiator not running
> > > >
> > > > Hi,
> > > >
> > > > I just deployed a new Central Manager for testing purposes. Even
> > > > though the NEGOTIATOR is included in the list of daemons, it does not
> > > > seem to be running.
> > > > condor_restart is not helping.
> > > >
> > > > Is this expected?
> > > > Any tip on how to troubleshoot and/or fix it will be more than welcome.
> > > >
> > > > [root@cm ~]# cat /etc/redhat-release
> > > > Scientific Linux release 7.9 (Nitrogen)
> > > >
> > > > [root@cm ~]# condor_version
> > > > $CondorVersion: 8.8.12 Nov 24 2020 BuildID: 524104 PackageID: 8.8.12-1 $
> > > > $CondorPlatform: x86_64_CentOS7 $
> > > >
> > > > [root@cm ~]# condor_config_val DAEMON_LIST
> > > > MASTER, COLLECTOR, NEGOTIATOR, HAD, REPLICATION
> > > >
> > > > [root@cm ~]# ps axf | grep -v grep | grep condor
> > > > 19319 ?        Ss     0:10 /usr/sbin/condor_master -f
> > > > 31030 ?        S      0:00  \_ condor_procd -A
> > > > /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R 1000000 -S 60
> > > > -C 993
> > > > 31031 ?        Ss     0:00  \_ condor_shared_port -f
> > > > 31032 ?        Ss     0:00  \_ condor_collector -f
> > > >
> > > > Thanks a lot in advance.
> > > > Cheers,
> > > > Jose
> > > > _______________________________________________
> > > > HTCondor-users mailing list
> > > > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> > > > subject: Unsubscribe
> > > > You can also unsubscribe by visiting
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.cs.wisc.edu_mailman_listinfo_htcondor-2Dusers&d=DwICAg&c=gRgGjJ3BkIsb5y6s49QqsA&r=10BCTK25QMgkMYibLRbpYg&m=JKmiZ0EBdadl4ykncuXSztMc_3Mipp2qSiXk_Xh8JbA&s=j--RcrV54ZYQ4ie-js0lBFTH-7DCPPlCWeb8WRbTbfs&e=
> > > >
> > > > The archives can be found at:
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.cs.wisc.edu_archive_htcondor-2Dusers_&d=DwICAg&c=gRgGjJ3BkIsb5y6s49QqsA&r=10BCTK25QMgkMYibLRbpYg&m=JKmiZ0EBdadl4ykncuXSztMc_3Mipp2qSiXk_Xh8JbA&s=jnqfQYdfVTtI5f9pXnXqWjKhWcNGx52UeTzxO52odv4&e=