Re: [Condor-users] -better-analyze doesn't tell me details (7.0.4)
- Date: Thu, 6 Nov 2008 16:13:53 +0100
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: Re: [Condor-users] -better-analyze doesn't tell me details (7.0.4)
On Thu, Nov 06, 2008 at 08:54:44AM -0600, Steven Timm wrote:
> At one time the Condor staff told me that you will
> only get the list of requirements that your job has got,
> if a non-zero number of machines is rejected
> by your job's requirements.
>
> The ways to get at the "reject the job for unknown reasons"
> are to do condor_q -ana -l 227322
Does this mean I have to ask for the reasons for the whole job cluster?
> That will tell you the last machine that rejected your match, and why.
> NegotiatorLog can sometimes tell you something too if you are running
> at high enough debug.
Actually, I set a Memory requirement that is only fulfilled by a couple of
machines (which are in single-slot configuration, as opposed to the two-core
boxes which offer two slots).
I get
227324.028:  Run analysis summary.  Of 1200 machines,
   1174 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
     26 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

The Requirements expression for your job is:

( ( target.Memory > 1500 ) ) && ( target.Arch == "X86_64" ) &&
( target.OpSys == "LINUX" ) && ( target.Disk >= DiskUsage ) &&
( TARGET.FileSystemDomain == MY.FileSystemDomain )

    Condition                                   Machines Matched  Suggestion
    ---------                                   ----------------  ----------
1   ( ( target.Memory > 1500 ) )                26
2   ( target.Arch == "X86_64" )                 1200
3   ( target.OpSys == "LINUX" )                 1200
4   ( target.Disk >= 2500 )                     1200
5   ( TARGET.FileSystemDomain == "$domain" )    1200

slot2@node599 Failed request constraint
where node599 is one of the 2-slot ones :(
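
As a quick sanity check on the summary above, the per-reason counts can be parsed and added up to confirm they account for every machine in the pool. This is only a minimal Python sketch: `parse_summary` is a hypothetical helper, not a Condor tool, and it assumes the summary keeps the layout pasted in this message.

```python
import re

# Summary text copied from the analysis output in this message.
SUMMARY = """\
227324.028:  Run analysis summary.  Of 1200 machines,
   1174 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
     26 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
"""


def parse_summary(text):
    """Return (total_machines, {reason: count}) from an analysis summary.

    Hypothetical helper: relies on the "Of N machines" header line and on
    each following line starting with a count followed by a reason.
    """
    total = None
    buckets = {}
    for line in text.splitlines():
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        row = re.match(r"\s*(\d+)\s+(.+)", line)
        if row:
            buckets[row.group(2).strip()] = int(row.group(1))
    return total, buckets


total, buckets = parse_summary(SUMMARY)
unknown = buckets["match but reject the job for unknown reasons"]
print(total, sum(buckets.values()), unknown)
```

The 26 machines in the "unknown reasons" bucket equal the 26 that satisfy condition 1 (`target.Memory > 1500`), so every machine that passes the job's own requirements ends up in that bucket.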
In what order are matching machines considered? (There are a few beyond 600,
in particular the single-slot ones, which are Unclaimed and Idle.)
> Two "unknown reasons" I've hit before are (a) the negotiation cycle
> just hasn't happened yet since this job was submitted and (b)
> the user in question has exceeded his group quota.
(a): I have waited for hours, and other jobs got scheduled in the meantime.
(b): Group quotas aren't in use.
Any more ideas?
Steffen
--
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html