
[Condor-users] Connection problem



Hi,

I'm trying to get a Windows Condor master/worker working with a Linux 
box (which is only used for submitting jobs). The Windows box is 
actually running inside VMware.

The Linux box is running a schedd, and I'm able to run a condor_submit 
command, which puts the job in the local queue. However, the job doesn't 
start. A condor_status fails too, indicating that the collector doesn't 
respond. Here is a part of the schedd log:


11/24 12:25:01 (pid:8775) attempt to connect to <xx.xx.xx.xx:9618>
failed: Invalid argument (connect errno = 22).  Will keep trying for 20
total seconds (20 to go).

11/24 12:25:21 (pid:8775) attempt to connect to <xx.xx.xx.xx:9618>
failed: Invalid argument (connect errno = 22).
11/24 12:25:21 (pid:8775) ERROR: SECMAN:2003:TCP connection to
<xx.xx.xx.xx:9618> failed

11/24 12:25:21 (pid:8775) Failed to start non-blocking update to
<xx.xx.xx.xx:9618>.
11/24 12:26:00 (pid:8775) get_file: Zero-length file check failed!
11/24 12:26:00 (pid:8775) Failed to receive file from client in
SendSpoolFile.
11/24 12:26:39 (pid:8775) DaemonCore: Command received via UDP from host
<127.0.1.1:32788>
11/24 12:26:39 (pid:8775) DaemonCore: received command 421 (RESCHEDULE),
calling handler (reschedule_negotiator)
11/24 12:26:39 (pid:8775) attempt to connect to <xx.xx.xx.xx:9618>
failed: Invalid argument (connect errno = 22).  Will keep trying for 20
total seconds (20 to go).

The Linux box can ping the Windows box (using its full hostname), and 
the Windows firewall is open for port 9618; I can connect to this port 
via telnet.
I don't know what is wrong. Does anyone have a clue about this?
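
For anyone who wants to repeat the telnet check in a scriptable form, here is a minimal sketch, assuming Python 3 is available on the Linux box. It makes one TCP connection attempt and reports the symbolic errno name on failure (on Linux, errno 22 is EINVAL, "Invalid argument", as in the log above). The function name try_connect and the default host 127.0.0.1 are placeholders of mine, not anything from Condor; pass the real collector address on the command line. It only tests plain TCP reachability and does not reproduce what the schedd itself does.

```python
import errno
import socket
import sys


def try_connect(host, port, timeout=5.0):
    """Attempt a single TCP connection and return (ok, detail)."""
    try:
        # create_connection resolves the name and connects in one step
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Map the numeric errno (e.g. 22) to its symbolic name (e.g. EINVAL).
        # Timeouts and name-resolution failures may not carry a known errno.
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{type(exc).__name__}: {name} ({exc})"


if __name__ == "__main__":
    # Placeholder defaults; 9618 is the port shown in the log above.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 9618
    ok, detail = try_connect(host, port)
    print(f"{host}:{port} -> {detail}")
    sys.exit(0 if ok else 1)
```

If this succeeds with the same address that appears in the schedd log while the schedd still fails, the difference presumably lies in how the schedd obtains or uses that address rather than in the network path, but that is a guess on my part.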

-- 
Matthieu Cargnelli
EADS CCR - Centre de Toulouse
Centreda 1
4, Avenue Didier Daurat
31700 BLAGNAC
Tel: (+33 5) 67.19.61.73