
Re: [HTCondor-users] Condor_kbdd.exe crashes/doesn't start if no network is available



We had some issues along these lines years ago (7.* series?) related to whether the network
was up or not.

As a workaround, our auto-install scripts that create the condor service set it to
start = auto (delayed) with depend = dhcp, so the service waits for the DHCP client.
No idea if we still need it with the latest versions.
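
For anyone wanting to reproduce that service setup, here is a rough sketch in Python of the kind of call such an install script might make. It is not our actual script. The service name "condor" and the dependency name "Dhcp" (the Windows DHCP Client service) are assumptions, so check them on your own machines, and the helper names are made up for illustration.

```python
import subprocess


def build_sc_config_args(service="condor", dependency="Dhcp"):
    """Build the sc.exe argument list for a delayed auto-start service.

    Both default names are assumptions; adjust them to your installation.
    """
    # sc.exe wants the option name, including its trailing '=', as a
    # separate token from the value that follows it.
    return ["sc", "config", service,
            "start=", "delayed-auto",
            "depend=", dependency]


def apply_service_settings(service="condor", dependency="Dhcp"):
    """Run sc.exe with the arguments above (Windows only, elevated prompt)."""
    return subprocess.run(build_sc_config_args(service, dependency), check=True)
```

Calling apply_service_settings() from an elevated prompt would be equivalent to typing "sc config condor start= delayed-auto depend= Dhcp" by hand.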

Cheers

Greg

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: Friday, 19 May 2017 12:03 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Condor_kbdd.exe crashes/doesn't start if no network is available

On 5/18/2017 2:17 AM, Michael Schwarzfischer wrote:
> Dear all,
> We are running condor 8.4.9 on Windows 7 clients. Somehow our network
> seems to have some stability issues from time to time...
>
> Especially, after login it seems to happen that the network connection
> is not yet fully established leading to a crash in the condor_kbdd.exe.
>
> This can easily be simulated by disabling the network adapter. Without
> the connection condor_kbdd.exe just doesn't start. Furthermore, there is
> no logging at all in that case (even with debug logging).
>
> We wonder if there are some workarounds or tricks in order to assure
> that the condor_kbdd is started and to assure that the process is still
> running.
>
> Thanks!
>
> Best,
>
> Michael
>
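
On the question of making sure the process is still running: one option is a small watchdog that polls the Windows task list. The following is only a sketch in Python, not something shipped with HTCondor. It assumes the standard "tasklist /FO CSV /NH" output format, and the function names are hypothetical.

```python
import csv
import io
import subprocess


def image_names(tasklist_csv):
    """Return the image names from `tasklist /FO CSV /NH` output."""
    # Each CSV row starts with the image name, e.g. "condor_kbdd.exe".
    return [row[0] for row in csv.reader(io.StringIO(tasklist_csv)) if row]


def kbdd_is_running(tasklist_csv, image="condor_kbdd.exe"):
    """True if the given image name appears in the task list text."""
    wanted = image.lower()
    return any(name.lower() == wanted for name in image_names(tasklist_csv))


def query_tasklist():
    """Ask Windows for the current task list as CSV text (Windows only)."""
    result = subprocess.run(["tasklist", "/FO", "CSV", "/NH"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

A scheduled task could call kbdd_is_running(query_tasklist()) every few minutes and raise an alert, or restart the condor service, when it returns False.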

Strange, my daily driver is a Windows laptop that always runs the 
condor_kbdd.exe. I often log in when no network 
connectivity is available, and have not observed a kbdd problem in 
years.  Admittedly I tend to run the latest release (currently running 
v8.7.1).  There were some kbdd crash issues fixed back in v8.2.x, but a 
quick scan of the tickets at wiki.htcondor.org does not reveal any known 
problems in v8.4.x.

Just brainstorming, but perhaps you could try telling the condor_kbdd to 
communicate over the loopback network instead of a "real" IP address 
(which perhaps you don't have).  You could append the following to 
your HTCondor config to give this a try:

# Tell the condor_kbdd to only use 127.0.0.1 for any/all communication
# to the startd, and tell the startd to listen on all network interfaces
# (to be certain the startd listens on both ethernet and loopback).
KBDD.NETWORK_INTERFACE = 127.0.0.1
KBDD.BIND_ALL_INTERFACES = False
STARTD.BIND_ALL_INTERFACES = True
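
Before relying on the loopback settings, it may be worth confirming that 127.0.0.1 is actually usable on the client while the network adapter is disabled. A minimal sketch in Python, using only the standard socket module, is below. It is a generic check and says nothing about how HTCondor itself binds its sockets. The function name is made up for illustration.

```python
import socket


def bind_loopback(host="127.0.0.1"):
    """Bind a TCP socket to the loopback address and return the port used.

    Port 0 asks the operating system to pick any free port. An OSError
    here would suggest loopback is not usable in the current state.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

If bind_loopback() returns a port number with the adapter disabled, loopback itself is fine and any remaining kbdd startup problem lies elsewhere.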

Hope the above helps,
Todd
