
Re: [HTCondor-users] Collector not responding using shared port daemon



Hi Todd,

Thanks for the hint.  Copying and pasting your configuration worked immediately.  I guess the issue was the ?sock=collector I had at the end of the COLLECTOR_HOST configuration value?

Anyway, it’s working.
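
For the archive, the change amounts to roughly the following. This is a sketch that assumes everything else is left at the built-in v8.3.5 defaults, as Todd describes below; the commented-out lines come from my earlier config, and treating them as the culprits is a guess, not a confirmed diagnosis.

```
# Shared port daemon on a non-standard port (9619), collector included.
# Assumes all other knobs stay at the built-in v8.3.5 defaults.
COLLECTOR_HOST = $(CONDOR_HOST):9619
USE_SHARED_PORT = TRUE

# From my earlier config; suspected (not confirmed) to conflict with the above:
#   COLLECTOR_HOST = X.X.X.X:9619?sock=collector
#   COLLECTOR_PORT = 9618
```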

-Derek




> On Jun 24, 2015, at 6:11 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
> 
> On 6/24/2015 5:18 PM, Derek Weitzel wrote:
>> I’m trying to set up a cluster to use the shared port daemon, including having the collector use the shared port daemon.  The collector is not responding to any requests.
>> 
> 
> Is your goal to have everything listen on port 9618 (the default), or to have
> everything listen on port 9619?  From your config knobs below it looks like you want everything on 9619, so I am not certain why COLLECTOR_PORT is set to 9618; i.e., at first blush your config below seems a bit conflicting.
> 
> With v8.3.5, if you want to use the shared port daemon on your central manager (including the collector), I think all you need to change from the default config is:
>  USE_SHARED_PORT = True
> And I think using a non-standard port (e.g., 9619) would just need the following changes from the default config:
>  COLLECTOR_HOST = $(CONDOR_HOST):9619
>  USE_SHARED_PORT = TRUE
> 
> By "default config", I mean the defaults built into the HTCondor v8.3.5 daemons, not what is sitting around in an old v8.2 condor_config file :). And yes, the settings to do this in v8.3.x are much simpler than (and unfortunately somewhat different from) those in v8.2.x.  Be warned, I am not at a machine where I can test my bold claims above, but am working from memory on ticket https://goo.gl/9ReCch
> 
> Early in v8.5, the plan is for everything (including the collector) to be set up to use shared_port by default out of the box, so hopefully discussions like this will soon become extinct.
> 
> hope the above helps,
> Todd
> 
> 
>> I get these messages in the shared port log:
>> 06/24/15 17:09:19 SharedPortClient - server response deadline has passed for collector as requested by SCHEDD <X.X.X.X:9619?noUDP&sock=57033_c968_3> on <X.X.X.X:56458>
>> 
>> Here are the relevant configuration options:
>> AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST = true
>> COLLECTOR_USES_SHARED_PORT = true
>> MAX_SHARED_PORT_LOG = $(MAX_DEFAULT_LOG)
>> SHARED_PORT = $(LIBEXEC)/condor_shared_port
>> SHARED_PORT_ADDRESS_REWRITING = false
>> SHARED_PORT_ARGS = -p 9619
>> SHARED_PORT_DAEMON_AD_FILE = $(LOCK)/shared_port_ad
>> SHARED_PORT_DEBUG = D_FULLDEBUG
>> SHARED_PORT_DEFAULT_ID =
>> SHARED_PORT_LOG = $(LOG)/SharedPortLog
>> SHARED_PORT_MAX_FILE_DESCRIPTORS = 4096
>> SHARED_PORT_PORT = 9619
>> USE_SHARED_PORT = True
>> COLLECTOR_HOST = X.X.X.X:9619?sock=collector
>> COLLECTOR_ARGS = -sock collector
>> COLLECTOR_PORT = 9618
>> 
>> And the version information:
>> 06/24/15 16:58:49 ** $CondorVersion: 8.3.5 Apr 28 2015 $
>> 06/24/15 16:58:49 ** $CondorPlatform: X86_64-CentOS_6.6 $
>> 
>> From the top of the CollectorLog:
>> 06/24/15 17:15:56 DaemonCore: non-shared command socket at <X.X.X.X:33947>
>> 06/24/15 17:15:56 Daemoncore: Listening at <0.0.0.0:33947> on TCP (ReliSock) and UDP (SafeSock).
>> 06/24/15 17:15:56 DaemonCore: command socket at <X.X.X.X:9619?noUDP&sock=collector>
>> 06/24/15 17:15:56 DaemonCore: private command socket at <X.X.X.X:9619?noUDP&sock=collector>
>> 
>> Let me know if any more information is required to help debug.
>> 
>> -Derek
>> 
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>> 
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/
>> 
> 
> 
> -- 
> Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
> Center for High Throughput Computing   Department of Computer Sciences
> HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
> Phone: (608) 263-7132                  Madison, WI 53706-1685