
Re: [Condor-users] jobs won't run: MY.Rank > MY.CurrentRank



Sorry, I forgot to mention the version: 6.6.10.
The central manager runs Solaris 9.

The ClassAd below is typical of one of the idle machines.
It seems to be all ready to run a job, but nothing happens.

I don't know how to further diagnose this problem.

Is there some way I can get more information about WHY a job is rejected by
this particular machine?

Andrew


depot  mel% condor_status -l CO-328-001-C.student.vuw.ac.nz
MyType = "Machine"
TargetType = "Job"
Name = "CO-328-001-C.student.vuw.ac.nz"
Machine = "CO-328-001-C.student.vuw.ac.nz"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "depot.mcs.vuw.ac.nz"
CondorVersion = "$CondorVersion: 6.6.10 Jun 22 2005 $"
CondorPlatform = "$CondorPlatform: INTEL-WINNT50 $"
VirtualMachineID = 1
VirtualMemory = 2295048
Disk = 36573004
CondorLoadAvg = 0.000000
LoadAvg = 0.000000
KeyboardIdle = 1287880
ConsoleIdle = 1287880
Memory = 1014
Cpus = 1
StartdIpAddr = "<130.195.7.232:1164>"
Arch = "INTEL"
OpSys = "WINNT51"
UidDomain = "vuw.ac.nz"
FileSystemDomain = "vuw.ac.nz"
Subnet = "130.195.7"
HasIOProxy = TRUE
TotalVirtualMemory = 2295048
TotalDisk = 36573004
KFlops = 670149
Mips = 2120
LastBenchmark = 1133980551
TotalLoadAvg = 0.000000
TotalCondorLoadAvg = 0.000000
ClockMin = 610
ClockDay = 4
TotalVirtualMachines = 1
HasFileTransfer = TRUE
HasMPI = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
JavaVendor = "Sun Microsystems Inc."
JavaVersion = "1.5.0"
JavaMFlops = 167.911209
HasJava = TRUE
StarterAbilityList = "HasFileTransfer,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasJava"
CpuBusyTime = 0
CpuIsBusy = FALSE
State = "Unclaimed"
EnteredCurrentState = 1133965958
Activity = "Idle"
EnteredCurrentActivity = 1133980551
Start = TRUE
Requirements = START
CurrentRank = 0.000000
DaemonStartTime = 1128371348
UpdateSequenceNumber = 19047
MyAddress = "<130.195.7.232:1164>"
LastHeardFrom = 1133989897
UpdatesTotal = 15535
UpdatesSequenced = 15534
UpdatesLost = 527
UpdatesHistory = "0x00000000000000000000000000000000"
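
As a sanity check, the ad above can be tested by hand against the job's Requirements (quoted in the original message below) and against the rank condition that condor_q -analyze reports. The following is only a minimal Python sketch: parse_ad, job_requirements_met and rank_condition_met are hypothetical helpers, not part of Condor, and the plain comparisons are a simplification of real ClassAd evaluation. Only a few attributes are copied from the listing.

```python
# Excerpt of the machine ad above; attribute values copied verbatim.
AD_TEXT = '''\
OpSys = "WINNT51"
Arch = "INTEL"
JavaMFlops = 167.911209
Rank = 0.000000
CurrentRank = 0.000000
State = "Unclaimed"
Activity = "Idle"
Start = TRUE
'''


def parse_ad(text):
    """Parse 'Name = value' lines into a dict (hypothetical helper)."""
    ad = {}
    for line in text.splitlines():
        if " = " not in line:
            continue
        name, value = line.split(" = ", 1)
        value = value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            ad[name] = value[1:-1]              # quoted string
        elif value.upper() in ("TRUE", "FALSE"):
            ad[name] = value.upper() == "TRUE"  # boolean literal
        else:
            try:
                ad[name] = float(value)         # numeric literal
            except ValueError:
                ad[name] = value                # unevaluated expression, kept as text
    return ad


def job_requirements_met(ad):
    """Mirror of the job's Requirements from the original message."""
    return (ad.get("OpSys") == "WINNT51"
            and ad.get("Arch") == "INTEL"
            and ad.get("JavaMFlops", 0.0) > 50)


def rank_condition_met(ad):
    """The condition named in the analysis output: MY.Rank > MY.CurrentRank."""
    return ad["Rank"] > ad["CurrentRank"]


ad = parse_ad(AD_TEXT)
print("job requirements met:", job_requirements_met(ad))
print("MY.Rank > MY.CurrentRank:", rank_condition_met(ad))
```

With the values from this ad, the requirements check passes, while the strict comparison 0.0 > 0.0 is false. Whether that alone explains the rejection is not something this sketch can establish.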


On Thu, 08 Dec 2005 10:08, Andrew Mellanby wrote:
> Hi,
>
> We've been installing Condor all over campus, and we now have over 400
> Windows XP machines running it.
> BUT ... none of my jobs will run :( . I was running 200 jobs yesterday;
> this morning there are 200 more machines and NO jobs will start.
>
>
> depot  mel% condor_q -analyze 220.7 -long
>
> -- Submitter: depot.mcs.vuw.ac.nz : <130.195.6.11:55312> : depot.mcs.vuw.ac.nz
> CC-HM01-002-C.s Failed rank condition: MY.Rank > MY.CurrentRank
> ---
> 220.007:  Run analysis summary.  Of 414 machines,
>       7 are rejected by your job's requirements
>      37 reject your job because of their own requirements
>       6 match, but are serving users with a better priority in the pool
>     364 match, match, but reject the job for unknown reasons
>       0 match, but will not currently preempt their existing job
>       0 are available to run your job
>
> Also:
>
> depot  mel% condor_q -analyze 223.499 -long
>
> -- Submitter: depot.mcs.vuw.ac.nz : <130.195.6.11:55312> : depot.mcs.vuw.ac.nz
> kk-218-003-c.st Failed offer constraint
> ---
>
> the only requirements the job has are:
> Requirements = (OpSys == "WINNT51") && (Arch == "INTEL") && (JavaMFlops > 50)
> and at least 400 machines meet those requirements.
>
> Any clues what's going on here?
>
> Thanks
>
> Andrew