
[Condor-users] incomplete udp for command 0 / command 2



Hello,

After monitoring Condor UDP traffic for a while, I've found an interesting
problem: sometimes clients start misbehaving and send only part of the data needed for the update_master_ad / update_startd_ad commands.

Has anyone seen this? Any ideas on what's causing it?

I've attached a tcpdump snippet showing one client, 10.92.25.15, sending complete master updates, but incomplete startd updates, while another client, 10.92.25.26, is doing the opposite. This continued for several hours, with the exact same packet patterns. I don't think it's simply a matter of UDP packets being lost, because it's not random at all. I first started looking into this because clients sending incomplete startd packets won't appear in a condor_status listing, presumably because the collector ignores broken updates.
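
For anyone wanting to check their own captures for the same pattern, here is a rough sketch of the reasoning above: random loss should produce varying gaps, while a misbehaving client repeats the same short payload length. Everything in it is an assumption for illustration: the function name, the `min_expected` size threshold, and the sample lengths are made up and are not taken from the attached dump or from Condor itself.

```python
from collections import Counter, defaultdict


def repeated_short_lengths(observations, min_expected, min_repeats=3):
    """Return {source: [lengths]} for short payload lengths that keep repeating.

    observations: iterable of (source, payload_length) pairs, e.g. extracted
                  by hand from a packet capture.
    min_expected: hypothetical smallest size of a complete update; anything
                  shorter is treated as suspect.
    min_repeats:  how often the same short length must recur before it is
                  considered systematic rather than a one-off.
    """
    counts = defaultdict(Counter)
    for source, length in observations:
        counts[source][length] += 1

    suspects = {}
    for source, lengths in counts.items():
        repeated = sorted(
            length
            for length, seen in lengths.items()
            if length < min_expected and seen >= min_repeats
        )
        if repeated:
            suspects[source] = repeated
    return suspects


# Made-up sample: client-a keeps sending the same truncated size,
# client-b has a single short packet that could be ordinary loss.
sample = (
    [("client-a", 1200)] * 4
    + [("client-a", 64)] * 4
    + [("client-b", 1180)] * 5
    + [("client-b", 300)]
)
print(repeated_short_lengths(sample, min_expected=500))
```

A source that shows up in the result is sending the identical short size over and over, which points at the sender rather than at the network.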

If I'm misinterpreting anything here, please correct me. :)

Both clients run Windows XP with Condor version 6.8.6. I've noticed this behavior on both, sometimes one, sometimes the other. The problem goes away after a condor_restart, but eventually recurs. The client logfiles don't show anything interesting.

Any ideas on how to debug / fix this would be welcome.

Thanks,

Rob de Graaf

Attachment: collector.dump
Description: tcpdump collector