
[HTCondor-users] Partitionable Slots and ConcurrencyLimits



Hi, all

 

Concurrency limits work very well.

On my test deployment – they work perfectly, limit users, all is wonderful.

But – the moment I try to translate this into the production environment, concurrency limits go out the window.

 

I set MARCV_LIMIT=1 and submit a "queue 100" job with ConcurrencyLimits=Owner in test: 1 runs and the rest wait.

When I do the same in production, more and more processes slowly creep in and run. Sometimes it stops at around 5-8, sometimes at 50, and sometimes all 100 end up running.
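
For reference, a minimal sketch of the setup described above. It assumes the limit name is the submitting user's name (marcv, inferred from MARCV_LIMIT) and uses a placeholder sleep job, since the real executable is not given here. The two parts live in different files: the limit in the central manager's configuration, the rest in the submit description file.

```text
# --- Central manager (negotiator) configuration ---
# At most 1 unit of the "marcv" limit may be in use pool-wide.
MARCV_LIMIT = 1

# --- Submit description file (placeholder job, assumed for illustration) ---
executable         = /bin/sleep
arguments          = 600
# Each job claims one unit of the limit named after the owner (assumed: marcv).
concurrency_limits = marcv
queue 100
```

With this in place, the expected behaviour is the one seen on the test cluster: one job running and 99 idle.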

 

There are two differences between the test cluster (4 cast-off machines) and production (70+ machines): size, and partitionable slots (which are used only in production).

 

Has anyone been able to make a partitionable slot setup work reliably with ConcurrencyLimits, and would they be willing to share their experience or ideas?

 

Many thanks!

 

Marc Volovic