
Re: [Condor-users] condor 6.8.2 + RHEL 4 - jobs stay idle, never run



Of course, I missed an expression in condor_config when I sent this.

	APPEND_REQUIREMENTS    = ( \
        	MY.RESOURCE_GROUP == TARGET.JOB_GROUP \
	)
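
For reference, here is a toy Python model of how that appended clause would be evaluated. It is only a sketch, not Condor's evaluator. It assumes that APPEND_REQUIREMENTS ends up in the job's Requirements, so MY is the job ad and TARGET is the machine ad. It also assumes that the job ad carries only JOB_GROUP (via SUBMIT_EXPRS) and the machine ad carries only RESOURCE_GROUP (via STARTD_EXPRS). The names `group_clause`, `job_ad` and `machine_ad` are made up for illustration. It does not address why 6.6.10 behaves differently.

```python
# Toy model of ClassAd MY/TARGET scoping; not Condor's real evaluator.
UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value


def group_clause(my_ad, target_ad):
    """Evaluate MY.RESOURCE_GROUP == TARGET.JOB_GROUP for one pairing of ads."""
    left = my_ad.get("RESOURCE_GROUP", UNDEFINED)
    right = target_ad.get("JOB_GROUP", UNDEFINED)
    if left is UNDEFINED or right is UNDEFINED:
        # A reference to a missing attribute makes the comparison UNDEFINED,
        # and a Requirements expression only matches when it is TRUE.
        return UNDEFINED
    return left == right


# Assumed ad contents (hypothetical, based on the SUBMIT_EXPRS and
# STARTD_EXPRS settings quoted in this thread).
job_ad = {"JOB_GROUP": "ssli"}
machine_ad = {"RESOURCE_GROUP": "ssli"}

# Inside the machine's START expression, MY is the machine ad.
from_machine_side = group_clause(my_ad=machine_ad, target_ad=job_ad)

# Inside the job's Requirements, MY is the job ad.
from_job_side = group_clause(my_ad=job_ad, target_ad=machine_ad)

print("machine side:", from_machine_side)
print("job side:", from_job_side)
```

Under those assumptions the clause is true when it is evaluated from the machine's side (as in START) and UNDEFINED when it is evaluated from the job's side, which would be consistent with the analyzer reporting 0 machines matched for it.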

>I've found a box that does have better-analyze available:
>
>( target.NikolaHost == "noddy" ) &&
>( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) ) && ( target.Arch == "INTEL" ) &&
>( target.OpSys == "LINUX" ) && ( target.Disk >= DiskUsage ) &&
>( ( target.Memory * 1024 ) >= ImageSize ) &&
>( TARGET.FileSystemDomain == MY.FileSystemDomain )
>
>    Condition                                           Machines Matched  Suggestion
>    ---------                                           ----------------  ----------
>1   ( ( MY.RESOURCE_GROUP == TARGET.JOB_GROUP ) )       0                 REMOVE
>2   ( target.NikolaHost == "noddy" )                    1
>3   ( target.Arch == "INTEL" )                          364
>4   ( target.OpSys == "LINUX" )                         377
>5   ( target.Disk >= 10000 )                            385
>6   ( ( 1024 * target.Memory ) >= 10000 )               385
>7   ( TARGET.FileSystemDomain == "ee.washington.edu" )  385
>
>
>This is exactly the same setup as the (working) 6.6.10 implementation.
>The following four lines are in /etc/condor/condor_config:
>
>        RESOURCE_GROUP = "ssli"
>        JOB_GROUP = "ssli"
>        SUBMIT_EXPRS = JOB_GROUP
>        STARTD_EXPRS = RESOURCE_GROUP
>
>
>The requirement part of the condor_config is:
>
>        IS_ALLOWED =  ( \
>                MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
>                MY.RESOURCE_GROUP == TARGET.USER_GROUP || \
>                MY.RESOURCE_GROUP == "ssli" \
>        )
>
>        IS_LOCAL =  ( \
>                MY.RESOURCE_GROUP == TARGET.JOB_GROUP || \
>                MY.RESOURCE_GROUP == TARGET.USER_GROUP \
>        )
>
>        START = $(UWCS_START) && $(IS_ALLOWED)
>        RANK = $(IS_LOCAL)
>
>
>
>"ssli" or "vlsi" or "mtml", etc., is filled in by the script that installs
>the condor_config on the host.
>
>When I remove the NikolaHost requirement this particular box actually
>sends jobs to the 6.6.10 pool just fine.  Noddy is a 32-bit system
>running RHEL 4 with Condor 6.8.2.  The boxes that are not sending jobs
>out at all are 64-bit boxes so I can understand why they would
>not be sending jobs to the 32-bit 6.6.10 systems.
>
>What I don't understand is why this requirement works in 6.6.10 but not
>in 6.8.2.
>
>nomad
>
>>> 6.8.2:
>>> 
>>>       Requirements = (START) && (IsValidCheckpointPlatform)
>>
>>IsValidCheckpointPlatform is automatically inserted by the startd, but 
>>it should evaluate to true for any vanilla job.  What does condor_q 
>>-better-analyze say?
>>
>>-Greg

nomad