
Re: [Condor-users] reject your job because of their own requirements



Even after setting RESERVED_SWAP = 0 in the condor_config file, the jobs
are still rejected. The messages in the SchedLog file are:

4/11 10:49:34 DaemonCore: Command received via TCP from host <192.168.10.244:32911>
4/11 10:49:34 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
4/11 10:49:34 Sent ad to central manager for condor@xxxxxxxxxxxxxx
4/11 10:49:34 Called reschedule_negotiator()
4/11 10:49:44 Activity on stashed negotiator socket
4/11 10:49:44 Negotiating for owner: condor@xxxxxxxxxxxxxx
4/11 10:49:44 Checking consistency running and runnable jobs
4/11 10:49:44 Tables are consistent
4/11 10:49:44 Out of servers - 0 jobs matched, 2 jobs idle, 2 jobs rejected


----------------------------------------------------------------------------------

I ran the two commands "condor_q -analyze" and "condor_q -l 5.0". Their output is:

----------------------------------------------------------------------------------
[root@grid ~]# condor_q -analyze

-- Submitter: grid.ancad.com : <192.168.10.244:32854> : grid.ancad.com
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
005.000:  Run analysis summary.  Of 4 machines,
     0 are rejected by your job's requirements
     4 reject your job because of their own requirements
     0 match, but are serving users with a better priority in the pool
     0 match, match, but reject the job for unknown reasons
     0 match, but will not currently preempt their existing job
     0 are available to run your job
       No successful match recorded.
       Last failed match: Mon Apr 11 10:49:44 2005
       Reason for last match failure: no match found

WARNING:  Be advised:   Request 5.0 did not match any resource's constraints
----------------------------------------------------------------------------------
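
To make the summary above easier to compare across runs, here is a minimal Python sketch that pulls the per-reason counts out of the analysis text. The function name parse_analysis and the ANALYSIS constant are my own names for illustration, not part of Condor; the sample text is copied from the output above.

```python
import re

# Sample text copied from the "condor_q -analyze" output above.
ANALYSIS = """\
005.000:  Run analysis summary.  Of 4 machines,
     0 are rejected by your job's requirements
     4 reject your job because of their own requirements
     0 match, but are serving users with a better priority in the pool
     0 match, match, but reject the job for unknown reasons
     0 match, but will not currently preempt their existing job
     0 are available to run your job
"""


def parse_analysis(text):
    """Return (total_machines, {reason: count}) from an analysis summary.

    Hypothetical helper: it only understands the layout shown above.
    """
    total = None
    counts = {}
    for line in text.splitlines():
        # Header line: "... Of 4 machines,"
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        # Count lines: "<number> <reason text>"
        row = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if row:
            counts[row.group(2)] = int(row.group(1))
    return total, counts


if __name__ == "__main__":
    total, counts = parse_analysis(ANALYSIS)
    for reason, count in counts.items():
        if count:
            print(f"{count} of {total} machines: {reason}")
```

For this job the only non-zero line is the "reject your job because of their own requirements" one, so the rejection appears to come from the machine side rather than from the job's own Requirements.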

[root@grid ~]# condor_q -l 5.0


-- Submitter: grid.ancad.com : <192.168.10.244:32854> : grid.ancad.com
MyType = "Job"
TargetType = "Machine"
ClusterId = 5
QDate = 1112953363
CompletionDate = 0
Owner = "condor"
RemoteWallClockTime = 0.000000
LocalUserCpu = 0.000000
LocalSysCpu = 0.000000
RemoteUserCpu = 0.000000
RemoteSysCpu = 0.000000
ExitStatus = 0
NumCkpts = 0
NumRestarts = 0
NumSystemHolds = 0
CommittedTime = 0
TotalSuspensions = 0
LastSuspensionTime = 0
CumulativeSuspensionTime = 0
ExitBySignal = FALSE
CondorVersion = "$CondorVersion: 6.6.9 Mar 10 2005 $"
CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
RootDir = "/"
Iwd = "/home/condor/examples"
JobUniverse = 1
Cmd = "/home/condor/examples/io.remote"
MinHosts = 1
MaxHosts = 1
CurrentHosts = 0
WantRemoteSyscalls = TRUE
WantCheckpoint = TRUE
JobStatus = 1
EnteredCurrentStatus = 1112953365
JobPrio = 0
User = "condor@xxxxxxxxxxxxxx"
NiceUser = FALSE
Env = ""
JobNotification = 2
UserLog = "/home/condor/examples/io.log"
CoreSize = 0
KillSig = "SIGTSTP"
Rank = 0.000000
In = "/dev/null"
TransferIn = FALSE
Out = "io.out"
Err = "io.err"
BufferSize = 524288
BufferBlockSize = 32768
ShouldTransferFiles = "NO"
TransferFiles = "NEVER"
ImageSize = 11991
ExecutableSize = 11991
DiskUsage = 11991
Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && ((CkptArch == Arch) || (CkptArch =?= UNDEFINED)) && ((CkptOpSys == OpSys) || (CkptOpSys =?= UNDEFINED)) && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize)
FileSystemDomain = "grid.ancad.com"
PeriodicHold = FALSE
PeriodicRelease = FALSE
PeriodicRemove = FALSE
OnExitHold = FALSE
OnExitRemove = TRUE
LeaveJobInQueue = FALSE
Args = "200"
ProcId = 0
WantMatchDiagnostics = TRUE
LastRejMatchReason = "no match found"
LastRejMatchTime = 1113188084
ServerTime = 1113188164
-------------------------------------------------------------------
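
As a sanity check on the job side, the Requirements expression above can be translated by hand into Python and evaluated against a machine description. This is only a rough sketch: the MACHINE values are assumptions made up for illustration (they are not taken from my pool), I am assuming CkptArch and CkptOpSys are job attributes that stay undefined until a first checkpoint, and plain Python booleans do not reproduce the full ClassAd evaluation rules.

```python
# Job attributes taken from the "condor_q -l 5.0" output above.
JOB = {"DiskUsage": 11991, "ImageSize": 11991}

# Hypothetical machine ad: these values are assumptions for illustration,
# not taken from the pool described in this message.
MACHINE = {"Arch": "INTEL", "OpSys": "LINUX", "Disk": 1000000, "Memory": 256}


def job_requirements(job, machine):
    """Hand translation of the job's Requirements expression into Python.

    None stands in for UNDEFINED; this is a simplification of ClassAd logic.
    """
    ckpt_arch = job.get("CkptArch")    # absent before the first checkpoint
    ckpt_opsys = job.get("CkptOpSys")
    return (
        machine["Arch"] == "INTEL"
        and machine["OpSys"] == "LINUX"
        and (ckpt_arch == machine["Arch"] or ckpt_arch is None)
        and (ckpt_opsys == machine["OpSys"] or ckpt_opsys is None)
        and machine["Disk"] >= job["DiskUsage"]
        and machine["Memory"] * 1024 >= job["ImageSize"]
    )


if __name__ == "__main__":
    print("job accepts machine:", job_requirements(JOB, MACHINE))
```

With any machine description along these lines the job's expression is satisfied, which agrees with the "0 are rejected by your job's requirements" line in the analysis; the open question is which condition in the machines' own requirements turns the job away.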


Best Regards,

Dennis Hsu
--------------------------
AnCAD Inc.
5F, No. 67, Sec. 1, Yonghe Rd.,
Yonghe City, Taipei County, 234
Taiwan