
Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.



You should ensure you have connectivity to the ports Condor is using. The Collector will be on 9618, which might be blocked by a firewall.
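
One way to check this from the execute machine is a plain TCP connect to the collector port. The sketch below is only an illustration and not part of Condor: the helper name `check_tcp_port` is made up here, and the address and port in the usage note are the ones that appear in the logs in this thread. If ping and ssh work but this connect fails with errno 113 (EHOSTUNREACH on Linux), a firewall rule rejecting the port is a likely cause, though that is an assumption about this particular setup.

```python
import errno
import socket


def check_tcp_port(host, port, timeout=5.0):
    """Try a TCP connect and return (ok, message).

    Hypothetical helper for illustration; not a Condor API.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected to %s:%d" % (host, port)
    except OSError as exc:
        # exc.errno can be None (for example on a timeout), so guard the lookup.
        name = errno.errorcode.get(exc.errno, "unknown") if exc.errno else "unknown"
        return False, "connect to %s:%d failed: %s (%s)" % (host, port, exc, name)
```

Calling `check_tcp_port("144.167.99.210", 9618)` on the execute machine and printing the returned message should show whether the collector port is reachable independently of Condor.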

Best,


matt


Charles Embry wrote:
> I already have that set in the condor_config files of the machines.
> 144.167.99.210 is the IP of the central manager's network interface card that is connected. It's the only NIC connected on the machine, and it can open a web browser to the internet, and ssh to and ping other machines on the same router. But in Condor the machines will not connect to each other. I run condor_master on both machines and they can never connect. :(
> 
> ----- Original Message -----
> From: hailong.yang1115 <hailong.yang1115@xxxxxxxxx>
> Date: Thursday, November 19, 2009 9:09 pm
> Subject: Re: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
> To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> 
>> Hi Charles,
>>
>> You can try to add the following:
>> NETWORK_INTERFACE=your specific network interface
>> into the configuration file to see if it works.
>>
>> Good luck!
>>
>> -Hailong
>> 2009-11-20
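
To find a value to try for that setting, one option is to ask the kernel which local address it would pick to reach the central manager. This is a rough sketch, not a Condor tool: the function names `local_address_for` and `network_interface_line` are invented for illustration, and it assumes NETWORK_INTERFACE is given as the IP address of the connected NIC. Connecting a UDP socket sends no packets; it only selects a route.

```python
import socket


def local_address_for(remote_host, remote_port=9618):
    """Return the local IPv4 address the OS would use to reach remote_host.

    Hypothetical helper for illustration; not a Condor API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket only fixes the route; no traffic is sent.
        sock.connect((remote_host, remote_port))
        return sock.getsockname()[0]


def network_interface_line(remote_host):
    """Format a config line using the address chosen above (assumed syntax)."""
    return "NETWORK_INTERFACE = %s" % local_address_for(remote_host)
```

On the execute machine, `print(network_interface_line("144.167.99.210"))` would print a candidate line for the local configuration file. If the address it reports is not on the connected NIC, that in itself points at a routing problem rather than a Condor setting.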
>> ***********************************************
>> * Hailong Yang, PhD.  Candidate 
>> * Sino-German Joint Software Institute, 
>> * School of Computer  Science&Engineering, Beihang University
>> * Phone: (86-010)82315908
>> *  Email: hailong.yang1115@xxxxxxxxx
>> *  Address: G413, New Main Building in Beihang University,  
>> *               No.37 XueYuan Road,HaiDian District,  
>> *               Beijing,P.R.China,100191
>> ***********************************************
>> From: Charles Embry
>> Sent: 2009-11-20 05:29:53
>> To: condor-users
>> Cc:
>> Subject: [Condor-users] Multiple Network Interface cards and central manager not communicating with execute machine.
>>
>> The Condor pool that I am trying to set up is on the same server rack/router, and the machines can ping each other and ssh to each other. But in Condor they don't seem to be communicating; condor_status never shows the execute machine that I am trying to add to the central manager (which is also a submit and execute machine). The machines are all Sun Microsystems Sun Fire servers. They all have 4 NICs (network interface cards). We are only using one (we have no need at this time to use all of them), and the other three on each machine are not hooked up to anything.
>>
>> On the execute machine I get this error in the log files:
>>
>> Master log__________
>>
>> 11/16 17:07:18 DaemonCore: Command Socket at <144.167.99.201:49652>
>> 11/16 17:07:18 Started DaemonCore process "/root/Desktop/condor-7.2.4/sbin/condor_startd", pid and pgroup = 27436
>> 11/16 17:07:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113). Will keep trying for 20 total seconds (20 to go).
>>
>> 11/16 17:07:44 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
>>
>> StartLog__________
>> 11/19 15:48:58 slot1: State change: IS_OWNER is false
>> 11/19 15:48:58 slot1: Changing state: Owner -> Unclaimed
>> 11/19 15:49:23 attempt to connect to <144.167.99.210:9618> failed: No route to host (connect errno = 113).
>> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
>> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
>> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
>> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
>> 11/19 15:49:23 ERROR: SECMAN:2004:Was waiting for TCP auth session to <144.167.99.210:9618>, but it failed.
>> 11/19 15:49:23 Failed to start non-blocking update to <144.167.99.210:9618>.
>> 11/19 15:49:23 ERROR: SECMAN:2004:Failed to create security session to <144.167.99.210:9618> with TCP.|SECMAN:2003:TCP connection to <144.167.99.210:9618> failed.
> 
>  
>> The condor_collector daemon is using port 9618 on the central manager, and that's the port that the execute machine is trying to connect to. Why do the machines not connect in Condor ("No route to host"?) when they can ping and ssh each other? Do I need to set something to make Condor use the only network interface that is connected? Or is it the port that is being used by the collector on the central manager?
> 
>> Thanks for the much-needed help.
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at: 
>> https://lists.cs.wisc.edu/archive/condor-users/