
[Condor-users] lost updates / network issues?



Howdy All,

I'm hoping someone can give me some advice on how to diagnose a problem with our pool.

We're running a test pool with a handful of resources. CondorView is showing that resources sometimes appear and disappear (see attached screenshot), though I've only rarely noticed this with condor_status. There is no specific reason for resources to join and leave, apart from network issues perhaps.

In addition, condor_status shows me that a lot of updates are being lost - sometimes around 1/4 (see below).

Hence, I've got two questions:

- Is this number of lost updates cause for concern? The machines are on a busy student network. Should I be increasing the rate at which updates occur?
- Why might CondorView be showing resources entering and leaving the pool? Is this cause for concern?

Regards,

James

UpdatesTotal = 4725
UpdatesSequenced = 4793
UpdatesLost = 1028

UpdatesTotal = 5151
UpdatesSequenced = 5148
UpdatesLost = 366

UpdatesTotal = 4636
UpdatesSequenced = 4612
UpdatesLost = 916

UpdatesTotal = 3688
UpdatesSequenced = 3630
UpdatesLost = 1175

UpdatesTotal = 5214
UpdatesSequenced = 5213
UpdatesLost = 361

UpdatesTotal = 5202
UpdatesSequenced = 5201
UpdatesLost = 1471

UpdatesTotal = 5221
UpdatesSequenced = 5220
UpdatesLost = 284
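
For reference, here is a rough Python sketch of how the loss fractions work out from the figures above. It assumes UpdatesLost should be measured against UpdatesSequenced, which may not be the right denominator; the names `samples` and `loss_fraction` are just illustrative and not part of Condor.

```python
# (UpdatesSequenced, UpdatesLost) pairs copied from the values listed above.
samples = [
    (4793, 1028),
    (5148, 366),
    (4612, 916),
    (3630, 1175),
    (5213, 361),
    (5201, 1471),
    (5220, 284),
]


def loss_fraction(sequenced, lost):
    """Fraction of sequenced updates reported lost.

    Assumption: UpdatesLost is counted relative to UpdatesSequenced.
    """
    if sequenced <= 0:
        raise ValueError("sequenced must be positive")
    return lost / sequenced


# Print one line per machine with its loss percentage.
for sequenced, lost in samples:
    pct = 100 * loss_fraction(sequenced, lost)
    print(f"sequenced={sequenced} lost={lost} loss={pct:.1f}%")
```

Under that assumption the machines range from roughly 5% to a little over 30% lost, so "around 1/4" describes the worse ones rather than the whole pool.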

Attachment: screen.jpg
Description: JPEG image