Re: [Condor-users] NEGOTIATE_ALL_JOBS_IN_CLUSTER
- Date: Wed, 28 Jan 2009 09:23:26 -0600
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] NEGOTIATE_ALL_JOBS_IN_CLUSTER
Steffen Grunewald wrote:
> For a homogeneous pool, and "simple" job clusters (identical specs for all
> jobs) NEGOTIATE_ALL_JOBS_IN_CLUSTER is suggested to be set to False.
> On the other hand, there may be situations where the first job of a single
> cluster continues to fail (for whatever reason: memory overcommit comes to
> mind) thus blocking all others.
Hi Steffen -
What version of Condor are you working with?
Starting with Condor v7.0.x, the default built-in autoclustering
mechanism in Condor should prevent the situation you describe above,
and do so in a much more efficient and scalable manner than setting
NEGOTIATE_ALL_JOBS_IN_CLUSTER to True (which is the kiss of
performance death if you have thousands of jobs).
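To illustrate the idea, here is a minimal conceptual sketch in Python. It is not Condor's actual implementation: the attribute names (memory_mb, arch), the negotiate() helper, and the slot_fits callback are all hypothetical, and it assumes there are enough free slots for every job that fits. Jobs are grouped by the values of their significant attributes, and matchmaking is attempted once per group rather than once per job, so one oversized job fails alone instead of blocking its siblings.

```python
from collections import defaultdict

# Hypothetical attribute names, chosen only for this illustration.
SIGNIFICANT = ("memory_mb", "arch")

def autocluster_key(job):
    # Jobs with identical values for every significant attribute
    # are treated as interchangeable for matchmaking purposes.
    return tuple(job.get(attr) for attr in SIGNIFICANT)

def negotiate(jobs, slot_fits):
    """Attempt one match per autocluster instead of one per job.

    slot_fits(job) -> bool is a stand-in for real matchmaking.
    Returns (ids of started jobs, number of match attempts).
    """
    groups = defaultdict(list)
    for job in jobs:
        groups[autocluster_key(job)].append(job)

    started, attempts = [], 0
    for members in groups.values():
        attempts += 1
        # One representative decides for the whole group; a rejected
        # group is skipped without affecting any other group.
        if slot_fits(members[0]):
            started.extend(j["id"] for j in members)
    return started, attempts

# One submit cluster: the first job asks for far more memory than the rest.
jobs = [{"id": "12.0", "memory_mb": 65536, "arch": "X86_64"}] + [
    {"id": f"12.{i}", "memory_mb": 1024, "arch": "X86_64"} for i in range(1, 5)
]
started, attempts = negotiate(jobs, lambda job: job["memory_mb"] <= 2048)
print(started, attempts)
```

In this toy model the oversized job 12.0 lands in its own group and is rejected, while the four small jobs are handled with a single further match attempt.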
> Is it possible to - e.g. once per given time period (4 hours?) - "flush"
> the queue by temporarily setting the macro to True?
Maybe something else is going on? With Condor v7.0.x and above and the
default autoclustering, I assert you should never have to resort to
NEGOTIATE_ALL_JOBS_IN_CLUSTER = True. Are you overriding
autoclustering in your config file by explicitly setting
SIGNIFICANT_ATTRIBUTES or some such in the condor_config on your submit
Todd Tannenbaum University of Wisconsin-Madison
Condor Project Research Department of Computer Sciences
tannenba@xxxxxxxxxxx 1210 W. Dayton St. Rm #4257