Hi Nicolas,
I was wondering if your GCB setup is working now.
I am trying to set up GCB for machines behind NAT, but I cannot see those machines in condor_status on the central manager, nor can I see the GCB machine (where only condor_master is running).
Since I cannot see the machines behind NAT, I cannot force a job to run on them as a test by specifying the machine name in the requirements.
If I try to submit the job from a machine on the private network (behind NAT), the executable file cannot be transferred and I get this error:
ERROR: failed to transfer executable file test.sh
StartLog (on the job-submitting machine)
**********
7/24 08:23:10 GCB: ERROR "GCB_bind: binding the socket locally failed" errno 98: Address already in use
7/24 08:23:10 GCB: ERROR "GCB_bind: binding the socket locally failed" errno 98: Address already in use
7/24 08:23:10 GCB: ERROR "GCB_bind: binding the socket locally failed" errno 98: Address already in use
SchedLog (on the job-submitting machine)
**********
7/24 07:53:53 (pid:1834) GCB: ERROR "GCB_bind: binding the socket locally failed" errno 98: Address already in use
7/24 07:53:53 (pid:1834) GCB: ERROR "GCB_bind: binding the socket locally failed" errno 98: Address already in use
7/24 08:50:21 (pid:1834) get_file: Zero-length file check failed!
7/24 08:50:21 (pid:1834) Failed to receive file from client in SendSpoolFile.
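For reference, errno 98 on Linux is EADDRINUSE: the address and port passed to bind() are already held by another socket. The following is only a minimal Python sketch of that condition, not Condor or GCB code; the function name try_double_bind is made up for illustration, and it says nothing about why GCB_bind hits the error here.

```python
import errno
import socket


def try_double_bind(host="127.0.0.1"):
    """Bind two TCP sockets to the same address and port.

    Returns the errno raised by the second bind, or None if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))          # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]  # the port the OS actually assigned
        try:
            second.bind((host, port))  # same address and port as `first`
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = try_double_bind()
    print(code, errno.errorcode.get(code))
```

If the same message shows up repeatedly in the daemon logs, it may be worth checking whether another process already holds the port the daemon is trying to bind.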
It looks like the public network machine (Central Manager), the GCB broker and the GCB clients are not communicating properly.
Do you know what might be the problem?
Thanks,
Senthil
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Nicholas Lavigne
Sent: Friday, July 20, 2007 10:20 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor on multiple network interfaces
I've installed the GCB broker, and from what I can tell from the logs, it is running correctly.
I have also configured the clients as explained in that document, but when
trying to run jobs, it seems like I have made absolutely no progress. The
central manager (which is also the network boundary) is able to match the job
to the execute machine, but the job does not run, and the execute machine's
status stays "Unclaimed".
Is there anyone with a similar setup who would be willing to share their
details (condor_config files)?
On 7/19/07, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Condor requires bidirectional connectivity between the submit node and
the execute node. In other words, Condor must be able to open up
network connections to the execute machine from the submit machine and
also to the submit machine from the execute machine.
If connections can only be made in one direction in your network (e.g.
from private to public), then you can configure Condor to use GCB to
broker connections in the reverse direction. There's a section in the
manual about that:
http://www.cs.wisc.edu/condor/manual/v6.8/3_7Networking.html#SECTION00473000000000000000
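A quick way to see which direction is blocked is to attempt a plain TCP connection each way. The following is a minimal Python sketch and not part of Condor; the function name can_connect, the host name and the port in the example are placeholders chosen for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False


if __name__ == "__main__":
    # Placeholder host and port: substitute a machine and a port that a
    # Condor daemon is actually listening on in your pool.
    print(can_connect("execute-node.example.org", 9618))
```

Run it once from the submit machine against the execute machine and once in the other direction. If only one direction succeeds, that is the one-way situation described above. Note that Condor daemons normally listen on dynamically chosen ports, so the port to test has to be taken from the running daemons.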
--Dan
Nicholas Lavigne wrote:
> Thanks for the reply. I now have all of the machines appearing in the
> pool (as reported by condor_status) but I have a new problem. I
> *think* I understand the problem, but as of yet the solution is
> evading me....
>
> Our network is mostly Windows and so the vanilla universe is my
> primary concern. Now, submitting a job from the public network, the
> central manager is able to match the job to a machine on the private
> network, but the job does not run, presumably because Condor's file
> transfer mechanism does not know how to transfer the file from the
> public network to the private network.
>
> I know that other pools are using a similar type of setup and so there
> must be a solution to this problem. I am not currently using a
> network file system, could this be the answer?
>
> -Nicholas
>
>
> On 7/18/07, *Tomas Grigera* <tgrigera@xxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I use a similar setup. I have
>
> BIND_ALL_INTERFACES = TRUE
>
>
> But you must make sure the server name resolves to the public IP also
> for the internal machines.
>
> Tomas
>
> On 7/17/07, Nicholas Lavigne <condor.list@xxxxxxxxxxxxxx> wrote:
> > Due to a shortage of allocated IP addresses on our university's network, we
> > have decided to use the central manager machine (running Debian) as a
> > gateway with two network interfaces and place some compute nodes on a
> > sub-network behind it. The router is doing its job correctly, but the
> > machines on the subnet do not seem to appear in the Condor pool.
> >
> > Are there any general rules for having Condor listen on two network
> > interfaces? Maybe some modification to the HOSTALLOW_READ and
> > HOSTALLOW_WRITE variables on the central manager? Currently,
> >
> > HOSTALLOW_READ = *
> > HOSTALLOW_WRITE = 134.95.*
> >
> > But I would like Condor to listen on the 192.168.10.* subnet as well.
> >
> > Any suggestions?
> >
> >
> > Thanks,
> > Nicholas Lavigne
> > University of Cologne
> > Graduiertenkolleg Risikomanagement
> >
> >
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/
> >
> >