Re: [condor-users] Network in Linux-Cluster and MPI



I think that if all your cluster computers are connected to both networks, it
is enough to run Condor on just one of them.
Set NETWORK_INTERFACE on each machine to the IP address of the interface that
is attached to the network shared by all computers. For instance, if all your
computers are on 192.168.10.*, you would set, say, 192.168.10.1 on the first,
and so on.
If the SUN and Linux computers are on two networks that are NOT interconnected,
you should set up a gateway as a router that forwards packets from SUN to
Linux and back transparently (from the application's point of view), and
afterwards configure Condor to use that network, as described above.
Mark
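
As an illustration of the advice above, here is a minimal Python sketch that
picks, from a machine's addresses, the one lying on the shared network and
prints the matching NETWORK_INTERFACE line for the local Condor config file.
The helper names and the second address 10.0.0.5 are hypothetical; only the
192.168.10.* network and the NETWORK_INTERFACE parameter come from this thread.

```python
import ipaddress


def pick_shared_address(local_addresses, shared_network):
    """Return the first local address inside shared_network, or None."""
    network = ipaddress.ip_network(shared_network)
    for address in local_addresses:
        if ipaddress.ip_address(address) in network:
            return address
    return None


def network_interface_line(local_addresses, shared_network):
    """Build the config line that binds Condor to the shared network."""
    address = pick_shared_address(local_addresses, shared_network)
    if address is None:
        raise ValueError(f"no local address on {shared_network}")
    return f"NETWORK_INTERFACE = {address}"


# Hypothetical head node with two interfaces; 10.0.0.5 is an assumed
# address on the second network, 192.168.10.1 is on the shared one.
head_node_addresses = ["10.0.0.5", "192.168.10.1"]
print(network_interface_line(head_node_addresses, "192.168.10.0/24"))
# prints: NETWORK_INTERFACE = 192.168.10.1
```

If no address falls inside the shared network, the helper raises an error,
which corresponds to the second case above, where a gateway between the two
networks is needed first.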

Quoting Degi Baatartsogt <baatarts@xxxxxxxxxxxxxxxxx>:

> 
> Hi Mark,
> 
> my problem is that we have a Linux cluster (Beowulf) here, so our Linux
> host has two interfaces. That is why I'm trying to use NETWORK_INTERFACE. I'm
> not sure what kind of address I should use, but I tried all possibilities.
> As I understand it, we can't solve this problem until we get the source
> code. Is that right?
> 
> On 23 Oct 2003, Mark Silberstein wrote:
> 
> > Well, I would not mix these two things.
> > Why do you use 0.0.0.0 settings for NETWORK_INTERFACE? If you have Linux
> > and SUN pools connected in any way via network, you should not need to
> > configure Condor to listen on more than one NW interface. Can you be
> > more specific about your network topology, so that we can understand this?
> > I expect that you would get the same communication problem for whatever
> > job you run, since ALL Condor communications would not work with
> > NETWORK_INTERFACE parameter set to 0.0.0.0
> >
> >
> > On Sun, 2003-10-19 at 17:10, Degi Baatartsogt wrote:
> > > Hi Mark,
> > >
> > > thank you for your response!
> > >
> > > > Sorry, from our experience this won't work. Condor can't really listen
> > > > on more than one NW interface, at least we did not succeed. If someone
> > > > from the team knows the answer, please share it with us!
> > > > Mark
> > >
> > > Does that mean that MPI jobs under Condor wouldn't work on the cluster?
> > > I ask because I get the same communication problem if I submit an
> > > MPI (MPICH) job to Condor on our cluster.
> > >
> > > Degi
> > >
> > > > On Wed, 2003-10-15 at 14:58, Degi Baatartsogt wrote:
> > > > > Hello everybody,
> > > > >
> > > > > > I'm trying to use flocking between a Sun pool and a Linux pool. For
> > > > > > that I changed the flocking parameters in both directions and set
> > > > > > NETWORK_INTERFACE to 0.0.0.0 in the global config file. Now I get the
> > > > > > following messages in the log files. Does anybody know what I should do?
> > > > >
> > > > > Thank you
> > > > > Ms Baatartsogt
> > > > >
> > > > > ==> SchedLog <==
> > > > > 10/15 12:37:59 DaemonCore: Command received via UDP from host
> <127.0.0.1:yyyyy>
> > > > > 10/15 12:37:59 DaemonCore: received command 421 (RESCHEDULE),
> calling
> > > > >                handler (reschedule_negotiator)
> > > > > 10/15 12:37:59 Sent ad to central manager for
> condor@xxxxxxxxxxxxxxxxxx
> > > > > 10/15 12:37:59 Called reschedule_negotiator()
> > > > > 10/15 12:37:59 DaemonCore: PERMISSION DENIED to unknown user from
> host
> > > > >                <127.0.0.1:xxxxx> for command 416 (NEGOTIATE)
> > > > >
> > > > > ==> CollectorLog <==
> > > > > 10/15 12:38:05 DC_AUTHENTICATE: attempt to open invalid session
> ipc654:15713:106
> > > > > 6213385:334, failing.
> > > > > 10/15 12:38:12 DC_AUTHENTICATE: attempt to open invalid session
> ipc654:15713:106
> > > > > 6213692:349, failing.
> > > > > 10/15 12:38:17 DC_AUTHENTICATE: attempt to open invalid session
> ipc654:15713:106
> > > > > ...
> > > > >
> > > > >
> > > > > Condor Support Information:
> > > > > http://www.cs.wisc.edu/condor/condor-support/
> > > > > To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> > > > > unsubscribe condor-users <your_email_address>
> > > >
> > >
> > > ---------------------------------------
> > > | Baatartsogt, O                        |
> > > | Max-Steenbeck-Str.3                   |
> > > | D-07745 Jena                          |
> > > | Germany                               |
> > >  ---------------------------------------
> > > | 00-49-3641-820740                     |
> > > | 00-49-174-4805271 (new)               |
> > > | http://www.minet.uni-jena.de/~baatarts|
> > >  ---------------------------------------
> > >
> >
> 

