Re: [Condor-users] Computers missing from Condor pool
- Date: Fri, 07 Mar 2008 15:50:39 +0100
- From: Rob de Graaf <r.degraaf@xxxxxxxxxxxx>
- Subject: Re: [Condor-users] Computers missing from Condor pool
Daniel Forrest wrote:
> There are some other things to look at with UDP. Monitor the output
> of "netstat -su" looking at "packet receive errors". If this number
> is going up then you are losing packets.
As it turns out, we were getting ~10% UDP loss during peak hours. We've
increased the kernel buffer limits, and haven't lost a packet since.
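For anyone who wants to track that counter over time rather than
eyeballing it, here is a rough sketch in Python. It assumes the usual
Linux "netstat -su" layout (a "Udp:" header in column 0 followed by
indented counter lines); the function names are made up for this
example and are not part of Condor or netstat.

```python
import re


def udp_receive_errors(netstat_text):
    """Return the 'packet receive errors' count from the Udp: section
    of `netstat -su` output, or None if the line is not present."""
    in_udp = False
    for line in netstat_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not line.startswith((" ", "\t")):
            # Section headers such as "Udp:" or "UdpLite:" start in column 0.
            in_udp = stripped == "Udp:"
            continue
        if in_udp:
            match = re.match(r"(\d+) packet receive errors$", stripped)
            if match:
                return int(match.group(1))
    return None


def errors_since(before_text, after_text):
    """Return how much the counter grew between two netstat samples."""
    before = udp_receive_errors(before_text)
    after = udp_receive_errors(after_text)
    if before is None or after is None:
        raise ValueError("no 'packet receive errors' line found")
    return after - before
```

The two samples could be captured a few minutes apart with
subprocess.run(["netstat", "-su"], capture_output=True, text=True).stdout;
a result above zero means packets were dropped in that interval.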
> Do a 'condor_status -l | condor_updates_stats | grep "Stats:"' and
> check for lost updates.
The change to the UDP buffer has considerably decreased the percentage
of lost updates reported by condor_updates_stats: mostly 0-2% lost
updates with some spikes at 10%, compared to 10-30% across the board
before the buffer increase. While this is definitely an improvement,
we're still not satisfied with the number of hosts in the pool; ping
sweeps still show some 20% additional live hosts the collector doesn't
know about.
Is there anything else we could try?
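To pin down exactly which machines are missing, the two host lists can
be compared directly. Below is a minimal sketch, assuming both lists
contain hostnames (not IP addresses) and that comparing on the short,
lower-cased name is good enough for the site; the helper names are
hypothetical. The collector's view could be dumped with something like
'condor_status -format "%s\n" Machine', and the live list would come
from whatever tool does the ping sweep.

```python
def short_name(host):
    """Reduce a hostname to its lower-cased first label."""
    return host.strip().lower().split(".")[0]


def missing_from_collector(live_hosts, collector_hosts):
    """Return sorted short names that answered the ping sweep but do
    not appear in the collector's list."""
    known = {short_name(h) for h in collector_hosts if h.strip()}
    live = {short_name(h) for h in live_hosts if h.strip()}
    return sorted(live - known)
```

Running this after each sweep would show whether the same machines are
missing every time or whether the set changes from one run to the next.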
Rob de Graaf