We are having problems getting jobs submitted from
a Linux submit host to run in a Windows lab behind a gateway. On the Windows
machine, the starter log shows the following errors:
10/3 19:10:35 Communicating with shadow <22.214.171.124:37473>
10/3 19:10:35 Submitting machine is
10/3 19:12:34 condor_read(): recv() returned -1, errno = 10054, assuming
failure reading 5 bytes from <126.96.36.199:55548>.
10/3 19:12:34 ERROR "Assertion ERROR on (result)" at line 113 in file
10/3 19:12:34 ERROR "LocalUserLog::logStarterError() called before
init()" at line 205 in file ..\src\condor_starter.V6.1\local_user_log.C
On the submit node, the shadow log shows:
10/3 19:16:58 Initializing a VANILLA shadow for job 85.0
10/3 19:17:18 (85.0) (13769): condor_read(): timeout reading 5 bytes from
10/3 19:17:18 (85.0) (13769): Request to run on
<188.8.131.52:1050> was ACCEPTED
10/3 19:18:06 (85.0) (13769): condor_read(): timeout reading 5 bytes from
10/3 19:19:16 (85.0) (13769): condor_read(): recv() returned -1, errno =
104, assuming failure reading 5 bytes from unknown source.
10/3 19:19:16 (85.0) (13769): ERROR "Can no longer talk to condor_starter
<184.108.40.206:1050>" at line 123 in file NTreceivers.C
We have opened holes in the gateway so that the lab can communicate
with both the submit host and the central manager. We can ping
between these machines without any problems, and the collector gathers
information about the available machines. However, something specific
to the submit-execute communication still seems to be blocked by the
gateway. If the gateway is opened up, everything works fine.
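
A successful ping only shows that ICMP passes the gateway; it says nothing about whether a particular TCP port is reachable. As a rough way to check individual ports from either side of the gateway, something like the following sketch could be used. It assumes Python 3 is available on the machine doing the test, and the helper name tcp_port_open is made up for illustration; it is not part of Condor.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration only; not a Condor tool.
    """
    try:
        # create_connection performs the full TCP handshake, which is
        # what a gateway rule on that port would allow or block.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling tcp_port_open with the execute machine's address and port 1050 (the starter port that appears in the shadow log above) from the submit host would show whether that one port is reachable. A False result cannot distinguish a port blocked by the gateway from one where nothing is listening, so the check is only meaningful while the daemon in question is running.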
Is there anything we can change in Condor or on the gateway to make this
work?
Thanks for your time.
Academic Information and Communication Technologies
Alberta, CANADA T6G