
[HTCondor-users] Windows XP computer matched but idle



Last week I added a dual-CPU Windows XP computer to the HTCondor pool, but I have not yet run a job on it successfully. condor_status shows its state as Matched, but its activity stays Idle while the other CPUs are Busy. condor_q reports that a number of CPUs "reject your job because of their own requirements"; the two CPUs of this WinXP computer account for two of them, and the others are the controller's, which are not available. The controller's SchedLog shows entries like:

11/19/12 12:22:51 (pid:2128) condor_read() failed: recv(fd=484) returned -1, errno = 10054 , reading 5 bytes from startd slot1@xxxxxxxxxx <remote ip:1074> for me.
11/19/12 12:22:51 (pid:2128) IO: Failed to read packet header
11/19/12 12:22:51 (pid:2128) Response problem from startd when requesting claim slot1@xxxxxxxxxx <remote ip:1074> for me 706.420.
11/19/12 12:22:51 (pid:2128) Failed to send REQUEST_CLAIM to startd slot1@xxxxxxxxxx <remote ip:1074> for me: CEDAR:6004:failed reading from socket
11/19/12 12:22:51 (pid:2128) Match record (slot1@xxxxxxxxxx <remote ip:1074> for me, 706.420) deleted
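
In case it helps anyone checking their own SchedLog for the same pattern, here is a minimal Python sketch that pulls the errno and slot name out of "condor_read() failed" lines and counts them. The field layout is inferred only from the excerpt above, not from HTCondor documentation, and the names parse_read_failure and count_failures are hypothetical helpers, not part of HTCondor.

```python
import re
from collections import Counter

# Layout inferred from the quoted SchedLog excerpt only (an assumption);
# other HTCondor versions or log settings may format these lines differently.
READ_FAILED = re.compile(
    r"(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\(pid:(?P<pid>\d+)\) condor_read\(\) failed: "
    r"recv\(fd=(?P<fd>\d+)\) returned (?P<ret>-?\d+), "
    r"errno = (?P<errno>\d+)\s*, .*? from startd (?P<slot>\S+)"
)


def parse_read_failure(line):
    """Return a dict of fields for a condor_read() failure line, else None."""
    m = READ_FAILED.match(line.strip())
    if m is None:
        return None
    return {
        "stamp": m.group("stamp"),
        "pid": int(m.group("pid")),
        "errno": int(m.group("errno")),
        "slot": m.group("slot"),
    }


def count_failures(lines):
    """Count failures per (slot, errno) pair across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        rec = parse_read_failure(line)
        if rec is not None:
            counts[(rec["slot"], rec["errno"])] += 1
    return counts
```

Passing an open SchedLog file object to count_failures would show whether the failures are confined to one slot or one error number.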

I didn't find any references to that error number, 10054, in the list's archives. What does the error mean, and what is going on here? The computer is apparently communicating its presence to the pool's controller, or it wouldn't be listed in the first place.

These computers have the Windows firewall disabled and are running MS Endpoint Protection, but I haven't had any problems with the other four WinXP computers in the pool. I also have a (hopefully separate) problem getting a Win7 computer to be recognized by HTCondor; it appeared once, following a condor restart.