
[Condor-users] 7.8.3: "rejected by your job requirements" again



Hi all,

After the 7.8.3 update, I have 4 jobs out of a DAG of 8000+ stuck with:
---------------------------------------------
The Requirements expression for your job is:

( ( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 ) ) &&
( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) &&
( TARGET.FileSystemDomain == MY.FileSystemDomain )

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   (
    [
    ].Memory > 0 )       0                   REMOVE
2   ( TARGET.Memory >= 7325 )         0                   MODIFY TO 1968
3   ( TARGET.Memory > 0 )             32
...
---------------------------------------------

Last time Condor pulled this TARGET.Memory requirement out of the ether,
I added "( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 )" to the job's
submit file. That worked until now.
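
For illustration, the workaround would sit in a submit file roughly as
below. This is a minimal sketch, not the actual file: the executable
name, its argument, and the bare queue statement are hypothetical
placeholders, and only the requirements line is taken from the text
above.

```
# Sketch of a submit description file. blast_wrapper.sh and
# query.fasta are hypothetical placeholders, not the real job.
executable   = blast_wrapper.sh
arguments    = query.fasta

# The workaround quoted above, copied verbatim.
requirements = ( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 )

queue
```

The remaining clauses in the analysis output (Arch, OpSys, Disk, Memory
>= RequestMemory, FileSystemDomain) do not appear in such a file; as far
as I can tell, condor_submit appends them to whatever requirements line
the file supplies.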

The other change is that I added another machine to the pool in the
middle of the run -- a 2x2 AMD -- but the stuck jobs are not on it.

What's curious this time is that all 4 jobs are stuck on one node, and
before they got stuck a whole lot of jobs ran to completion on that
node.

The jobs are BLAST sequence searches, the execute nodes are all CentOS
6.3 x86_64 AMDs (2- to 8-core), and the whole setup has been running
weekly for years.

Any suggestions?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
