We are having problems getting jobs submitted from a Linux submit host to run on a Windows lab behind a gateway. On the Windows machine, the starter log shows the following errors:
10/3 19:10:35 Communicating with shadow <220.127.116.11:37473>
10/3 19:10:35 Submitting machine is "opteron-cluster.nic.ualberta.ca"
10/3 19:12:34 condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <18.104.22.168:55548>.
10/3 19:12:34 ERROR "Assertion ERROR on (result)" at line 113 in file ..\src\condor_starter.V6.1\NTsenders.C
10/3 19:12:34 ERROR "LocalUserLog::logStarterError() called before init()" at line 205 in file ..\src\condor_starter.V6.1\local_user_log.C
On the submit node, the shadow log shows:
10/3 19:16:58 Initializing a VANILLA shadow for job 85.0
10/3 19:17:18 (85.0) (13769): condor_read(): timeout reading 5 bytes from <22.214.171.124:1050>.
10/3 19:17:18 (85.0) (13769): Request to run on <126.96.36.199:1050> was ACCEPTED
10/3 19:18:06 (85.0) (13769): condor_read(): timeout reading 5 bytes from <188.8.131.52:1050>.
10/3 19:19:16 (85.0) (13769): condor_read(): recv() returned -1, errno = 104, assuming failure reading 5 bytes from unknown source.
10/3 19:19:16 (85.0) (13769): ERROR "Can no longer talk to condor_starter <184.108.40.206:1050>" at line 123 in file NTreceivers.C
We have opened holes in the gateway to allow communication between the lab, the submit host, and the central manager. We can ping between these machines without any problems, and the collector gathers information about the available machines. However, something specific to the submit-execute communication seems to be blocked by the gateway. If the gateway is opened up entirely, everything works fine.
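Since ping only exercises ICMP, it does not show whether the TCP ports used by the shadow and starter actually pass through the gateway. A minimal sketch of a direct TCP probe in Python follows. The host name is a placeholder, and port 1050 is simply the starter port that appears in the shadow log above; the ports Condor picks for a given job may well differ, so treat both as assumptions.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) completes within timeout."""
    try:
        # create_connection performs the full TCP handshake, unlike ping (ICMP).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, resets, and name lookup failures.
        return False


if __name__ == "__main__":
    # Hypothetical targets: replace with the real execute node and the
    # ports seen in the starter and shadow logs.
    targets = [("execute-node.example.org", 1050)]
    for host, port in targets:
        state = "reachable" if tcp_reachable(host, port) else "blocked or closed"
        print(f"{host}:{port} is {state}")
```

Running this from the submit host toward the execute node, and again in the opposite direction, would indicate which side of the gateway drops the connection. A port only answers while a daemon is listening on it, so a negative result is meaningful only during a running job.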
Is there anything we can change in Condor or on the gateway to make this work?
Thanks for your time.
fujinaga@xxxxxxxxxxx Tel.: (780) 492-2117 Fax.: (780) 492-1729
Research Computing Support
Academic Information and Communication Technologies (AICT)
University of Alberta, Edmonton, Alberta, CANADA T6G 2H1