
Re: [HTCondor-users] Failing to start: Debian 8 (assertion error)



Hi Brian,

I double-checked, and there is definitely a V8_5_1 tag. I don't know why you cannot find it. However, the head of the V8_5_1-branch is equivalent.

I installed 8.5.1 on my Debian 8 VM from the repository. The condor_master did not trip the assertion, and it is using the shared port daemon.

Using the shared port daemon by default is new for 8.5.1.
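
If the shared port default turns out to be involved on your Debian machines, one way to test that is to switch it off locally. This is only a sketch: it assumes the standard USE_SHARED_PORT configuration knob and uses the /etc/condor/condor_config.local file listed in your MasterLog, and I have not verified that it avoids the assertion.

```
# /etc/condor/condor_config.local
# Assumption: turning shared port off restores the pre-8.5.1 default behavior.
USE_SHARED_PORT = False
```

After changing it, start the condor_master again and check whether the SharedPortEndpoint line still appears in MasterLog.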

...Tim

On 01/04/2016 03:39 PM, Brian Candler wrote:
I upgraded some machines to 8.5.1 under Ubuntu 14.04 with no problems. This is using the packages from
http://research.cs.wisc.edu/htcondor/ubuntu/development/

But after upgrading some Debian 8 machines to the packages under
http://research.cs.wisc.edu/htcondor/debian/development/
they won't start. Condor MasterLog says:

01/04/16 21:22:33 Using config source: /etc/condor/condor_config
01/04/16 21:22:33 Using local config sources:
01/04/16 21:22:33    /etc/condor/condor_config.local
01/04/16 21:22:33 config Macros = 75, Sorted = 75, StringBytes = 2085, TablesBytes = 2748
01/04/16 21:22:33 CLASSAD_CACHING is OFF
01/04/16 21:22:33 Daemon Log is logging: D_ALWAYS D_ERROR
01/04/16 21:22:33 SharedPortEndpoint: waiting for connections to named socket 7709_a017
01/04/16 21:22:33 ERROR "Assertion ERROR on (s.hasAddrs())" at line 1147 in file /slots/03/dir_15104/userdir/src/condor_daemon_core.V6/daemon_core.cpp

In a git checkout I can find no tag V8_5_1 (only V8_5_0), but there is a branch origin/V8_5_1-branch. If I check that out, the offending assertion appears to be here:

char const *
DaemonCore::InfoCommandSinfulStringMyself(bool usePrivateAddress)
{
        static char * sinful_public = NULL;
        static char * sinful_private = NULL;
        static bool initialized_sinful_private = false;

        if( m_shared_port_endpoint ) {
                        // We do not advertise (or probably even have) our own network
                        // port.  Instead, we advertise SharedPortServer's port along
                        // with our local id so connections can be forwarded to us.
                char const *addr = m_shared_port_endpoint->GetMyRemoteAddress();

                if( addr ) {
                        // Remote addresses can be accessed from other machines, so
                        // they must have addrs.
                        Sinful s( addr );
                        ASSERT( s.hasAddrs() );
                }

Any ideas what's causing this?

I don't believe there is any particular configuration which is asking for shared ports. I have now reset /etc/condor/condor_config to the distribution one, and pushed out the same /etc/condor/condo as is running happily on the Ubuntu boxes (using ansible), and the problem remains.

However, the Debian boxes *did* previously have a compiled-from-source version of htcondor, so maybe something wasn't cleaned up properly from removing that before installing the package version.

Any ideas?

Thanks,

Brian.




-- 
Tim Theisen
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736