
[Condor-users] starter failed to connect to collector



Hi,

I have a few machines in a 30-node WinXP pool that refuse to start jobs.
I see these in the starter log:

...
10/14 14:59:38 vm2: Received match <192.168.0.162:1353>#4441711918
10/14 14:59:38 vm2: State change: match notification protocol successful
10/14 14:59:38 vm2: Changing state: Unclaimed -> Matched
10/14 15:01:38 vm2: State change: match timed out
10/14 15:01:38 vm2: Changing state: Matched -> Owner
10/14 15:01:38 vm2: State change: IS_OWNER is false
10/14 15:01:38 vm2: Changing state: Owner -> Unclaimed
...
10/14 15:04:50 DaemonCore: Command received via TCP from host <192.168.0.98:3484>
10/14 15:04:50 DaemonCore: received command 442 (REQUEST_CLAIM), calling handler (command_request_claim)
10/14 15:04:50 Error: can't find resource with capability (<192.168.0.162:1353>#4441711918)
....
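
Reading the timestamps in that excerpt: the match is dropped exactly two minutes after it arrives, and the REQUEST_CLAIM only shows up a few minutes later, which may be why the capability can no longer be found. A minimal Python sketch of that arithmetic, assuming every log line starts with the `MM/DD HH:MM:SS` stamp shown above (`parse_ts` is a hypothetical helper written for this note, not part of Condor; the log line for 15:04:50 is abbreviated):

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading 'MM/DD HH:MM:SS' stamp of a log line.

    The log carries no year, so strptime falls back to its default (1900);
    that is fine for differences within the same day.
    """
    return datetime.strptime(line[:14], "%m/%d %H:%M:%S")

matched = parse_ts("10/14 14:59:38 vm2: Changing state: Unclaimed -> Matched")
timed_out = parse_ts("10/14 15:01:38 vm2: State change: match timed out")
claim = parse_ts("10/14 15:04:50 DaemonCore: received command 442 (REQUEST_CLAIM)")

# How long the slot waited in Matched before giving up.
match_window = (timed_out - matched).total_seconds()
# How long after the timeout the claim request finally arrived.
late_by = (claim - timed_out).total_seconds()

print(match_window)  # 120.0
print(late_by)       # 192.0
```

So the claim request arrives 192 seconds after the slot has already left the Matched state.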

and also (after reboot):
10/14 16:32:17 Error sending update to the collector : Failed to connect to collector
10/14 16:32:17 vm2: Error sending update to collector(s)
10/14 16:34:18 bind failed: WSAError = 10049
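
For reference, Winsock error 10049 is WSAEADDRNOTAVAIL ("cannot assign requested address"), which usually means the process tried to bind to an IP address that is not configured on the local machine. A rough way to check this outside Condor is to try binding a socket by hand. The sketch below is only an illustration: `can_bind` is a hypothetical helper, and the address to test is whatever the affected node is supposed to be using, which the log does not show.

```python
import errno
import socket

def can_bind(ip, port=0):
    """Try to bind a TCP socket to ip:port (port 0 = any free port).

    Returns (True, None) on success, or (False, errno) on failure.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((ip, port))
        except OSError as e:
            return False, e.errno
    return True, None

# Loopback is used here only so the example runs anywhere;
# substitute the address the affected node should be using.
ok, err = can_bind("127.0.0.1")
if ok:
    print("bind succeeded")
elif err == errno.EADDRNOTAVAIL:
    print("address is not assigned to any local interface")
else:
    print("bind failed with errno", err)
```

If the node's expected address fails with `errno.EADDRNOTAVAIL` while loopback succeeds, that would point at the machine's network configuration (for example an interface that has not come up yet after reboot) rather than at Condor itself.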


Any suggestions on where to start? Wipe the disk? (It's been a long day...)

Yours,
pdev. 
