Re: [HTCondor-users] new to HTCONDOR & some dumb questions :(



Hi Christoph,

Is it possible that particular machines are not running any jobs while others are?  If so, something may have gone wrong in the config on those machines; the security settings are one common culprit.

Other things to check:
1) Look in the NegotiatorLog on the central manager.
2) Try "condor_q -better -reverse <ID>" to see whether a given machine accepts your job (as opposed to whether your job accepts the machine).  Recall that matching is bidirectional: the machine's requirements must accept the job and the job's requirements must accept the machine.
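
To make the bidirectional point concrete, here is a minimal toy sketch in Python. It is not HTCondor's ClassAd API: the dictionaries, the lambdas, and the two example slots (open_slot, picky_slot) are hypothetical stand-ins. The job's requirements mirror the expression condor_q printed for the job; the "picky" machine policy is invented purely for illustration.

```python
# Toy model of two-way matchmaking. The dicts and lambdas below are
# illustrative assumptions, not HTCondor's real ClassAd machinery.

def matches(job, machine):
    """A match needs BOTH requirements expressions to evaluate to true."""
    job_ok = job["requirements"](job, machine)          # job likes the machine
    machine_ok = machine["requirements"](machine, job)  # machine likes the job
    return job_ok and machine_ok

# Job side: same conditions as the Requirements expression in the analysis.
job = {
    "RequestMemory": 8,
    "RequestDisk": 7500,
    "requirements": lambda my, target: (
        target["Arch"] == "X86_64"
        and target["OpSys"] == "LINUX"
        and target["Disk"] >= my["RequestDisk"]
        and target["Memory"] >= my["RequestMemory"]
    ),
}

# Hypothetical slot that accepts every job.
open_slot = {
    "Arch": "X86_64", "OpSys": "LINUX", "Disk": 100000, "Memory": 2048,
    "requirements": lambda my, target: True,
}

# Hypothetical slot whose own policy refuses small-memory jobs.
picky_slot = {
    "Arch": "X86_64", "OpSys": "LINUX", "Disk": 100000, "Memory": 2048,
    "requirements": lambda my, target: target["RequestMemory"] >= 1024,
}

print(matches(job, open_slot))   # True
print(matches(job, picky_slot))  # False
```

In the second case the job is perfectly happy with the machine, yet there is no match. That is the situation condor_q reports as "reject your job because of their own requirements", and it is why the machine-side analysis is the one worth running here.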

Brian

> On Apr 22, 2015, at 4:01 AM, Beyer, Christoph <christoph.beyer@xxxxxxx> wrote:
> 
> 
> Hi,
> 
> as stated above I am new to HTCondor and have some issues with a test installation that I cannot seem to clear up by myself.
> 
> I run a pool with 5 nodes and one submit node. Everything is quite 'default'. Since I am alone on my pool, I would expect to be able to flood the whole thing when I submit a bunch of 'loop.remote' jobs, but I never get more than 32 jobs running at any given time.
> 
> Using condor_q I see that some slots seem to reject my job due to their own requirements (?) 
> 
> Is there any place I can look up these requirements, e.g. in case this is quota related (although no quota is actually set)?
> 
> [chbeyer@$HOST]~% condor_q -better 179.9494
> 
> 
> -- Submitter: bm-test.desy.de : <$IP:58611> : $HOST.desy.de
> ---
> 179.9494:  Request has not yet been considered by the matchmaker.
> 
> User priority for chbeyer@xxxxxxx is not available, attempting to analyze without it.
> ---
> 179.9494:  Run analysis summary.  Of 53 machines,
>      0 are rejected by your job's requirements 
>     42 reject your job because of their own requirements 
>      0 match and are already running your jobs 
>      1 match but are serving other users 
>     10 are available to run your job
> 
> The Requirements expression for your job is:
> 
>    ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
>    ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) ) &&
>    ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) ) &&
>    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )
> 
> Your job defines the following attributes:
> 
>    DiskUsage = 7500
>    ImageSize = 7500
>    RequestDisk = 7500
>    RequestMemory = 8
> 
> The Requirements expression for your job reduces to these conditions:
> 
>         Slots
> Step    Matched  Condition
> -----  --------  ---------
> [0]          53  TARGET.Arch == "X86_64"
> [1]          53  TARGET.OpSys == "LINUX"
> [4]          53  CkptArch is undefined
> [8]          53  CkptOpSys is undefined
> [11]         53  TARGET.Disk >= RequestDisk
> [13]         53  TARGET.Memory >= RequestMemory
> 
> Suggestions:
> 
>    Condition                         Machines Matched    Suggestion
>    ---------                         ----------------    ----------
> 1   ( TARGET.Arch == "X86_64" )       53                   
> 2   ( TARGET.OpSys == "LINUX" )       53                   
> 3   ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) )
>                                      53                   
> 4   ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) )
>                                      53                   
> 5   ( TARGET.Disk >= 7500 )           53                   
> 6   ( TARGET.Memory >= ifthenelse(MemoryUsage isnt undefined,MemoryUsage,8) )
>                                      53                   
> 
> The following attributes are missing from the job ClassAd:
> 
> CheckpointPlatform
> 
> In the logs I see a lot of errors like this: 
> 
> 04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257
> 04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257
> 
> Any hints much appreciated !!! 
> 
> best regards
>        ~christoph
> 
> 
> -- 
> /*   Christoph Beyer     |   Office: Building 2b / 23     *\
> *   DESY                |    Phone: 040-8998-2317        *
> *   - IT -              |      Fax: 040-8994-2317        *
> \*   22603 Hamburg       |     http://www.desy.de         */