
[Condor-users] Condor acts weird


We have set up a Condor cluster with 13 machines, 8 nodes per machine, and when we attempt to run jobs we hit the following problem:

When a single user submits from a single machine, at most 48 nodes get Claimed out of 104 nodes in total. Most of the nodes get Matched, then time out and return to Unclaimed. This goes on and on no matter how big the jobs are or how much we increase the Match timeout variable.
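For reference, the per-state counts can be tallied roughly as follows. This is only a minimal sketch: it assumes the node states have already been dumped to text, one state name per line (for example from condor_status output), and the function name tally_states and the Unclaimed/Matched split in the sample are made up for illustration. Only the 48 Claimed and 104 total come from what we actually see.

```python
from collections import Counter

def tally_states(lines):
    """Count how many nodes are in each state, given one state name per line."""
    # Strip whitespace and skip blank lines before counting.
    return Counter(line.strip() for line in lines if line.strip())

# Hypothetical sample, not real output from our pool:
# 48 Claimed as observed, the remaining 56 split arbitrarily.
sample = ["Claimed"] * 48 + ["Unclaimed"] * 50 + ["Matched"] * 6

counts = tally_states(sample)
print(counts["Claimed"], "claimed out of", sum(counts.values()))
```

Running the tally repeatedly while the jobs sit in the queue would show whether the Matched count keeps draining back into Unclaimed.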

When multiple users submit from multiple machines, all nodes get Claimed without any problems.

We have been banging our heads against this for a few days and couldn’t figure it out. Is it possible that it’s a network problem, such as saturation, and that we need to ask the network administrator to look into it?

Thank you,