
[Condor-users] Job in Idle State



Hi,

The job stays in the idle state, and I am getting the following errors in SchedLog (with SCHEDD_DEBUG = D_FULLDEBUG).

I am not sure what might be the problem.

Thanks,

Senthil


9/28 11:16:46 In checkContactQueue(), args = 0x8ec52e0, host=< xxx.xxx.xx.xx:9612>

9/28 11:16:46 In Scheduler::contactStartd()

9/28 11:16:46 <xxx.xxx.xx.xx:9612>#1159394643#4 Senthil@xxxxxxxxxxx < xxx.xxx.xx.xx:9612> 96.0

9/28 11:16:49 -------- Begin starting jobs --------

9/28 11:16:49 match (<xxx.xxx.xx.xx:9612>#1159394643#4) waiting for connection

9/28 11:16:49 -------- Done starting jobs --------

9/28 11:17:32 attempt to connect to < xxx.xxx.xx.xx:45995> timed out

9/28 11:17:32 In Scheduler::startdContactConnectHandler

9/28 11:17:32 Got mrec data pointer 0x8ef3040

9/28 11:17:32 Failed to connect to startd < xxx.xxx.xx.xx:9612>

9/28 11:17:32 Called send_vacate( < xxx.xxx.xx.xx:9612>, 443 )

9/28 11:17:52 select returns 0, connect failed

9/28 11:17:52 Will keep trying for 20 seconds...

9/28 11:17:53 Connect failed for 21 seconds; returning FALSE

9/28 11:17:53 ERROR: SECMAN:2003:TCP connection to < xxx.xxx.xx.xx:9612> failed

 

9/28 11:17:53 Sent RELEASE_CLAIM to startd on < xxx.xxx.xx.xx:9612>

9/28 11:17:53 Match record (<xxx.xxx.xx.xx:9612>, 96, 0) deleted

9/28 11:17:53 ClaimId of deleted match: < xxx.xxx.xx.xx:9612>#1159394643#4

9/28 11:17:53 DaemonCore: Command received via TCP from host < xxx.xxx.xx.xx:46002>

9/28 11:17:53 DaemonCore: received command 1111 (QMGMT_CMD), calling handler (handle_q)

9/28 11:17:53 condor_read(): Socket closed when trying to read buffer

9/28 11:17:53 IO: EOF reading packet header

9/28 11:17:53 QMGR Connection closed

9/28 11:18:52 Getting monitoring info for pid 5042
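
The log above shows the schedd timing out while connecting to the startd, so one thing worth ruling out is plain TCP reachability between the two hosts. Below is a minimal sketch of such a check in Python, run from the submit machine. The function name `can_connect` and the command-line handling are illustrative only and are not part of Condor; the host and port would be the startd address taken from the log. It only tests whether a TCP connection can be opened and says nothing about Condor's own security negotiation.

```python
import socket
import sys


def can_connect(host, port, timeout=20.0):
    """Return True if a TCP connection to (host, port) opens within timeout seconds.

    Hypothetical helper for illustration; it only checks reachability.
    """
    try:
        # create_connection resolves the host and applies the timeout to connect()
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_startd.py <startd-host> <startd-port>
    host, port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If this reports the port as not reachable, a firewall or routing problem between the submit and execute machines would be a likely suspect, though that is an assumption the log alone does not confirm.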