[HTCondor-users] new to HTCONDOR & some dumb questions :(



Hi,

As the subject says, I am new to HTCondor and have some issues with a test installation that I cannot seem to resolve by myself.

I run a pool with 5 nodes and one submit node. Everything is pretty much at its defaults. Since I am alone on the pool, I would expect to be able to fill it completely when I submit a bunch of 'loop.remote' jobs, but I never get more than 32 jobs running at any given time.

Using condor_q, I see that some slots seem to reject my job because of their own requirements (?).

Is there any place where I can look up these requirements, e.g. to see whether this is quota-related (although no quota is actually set)?

[chbeyer@$HOST]~% condor_q -better 179.9494


-- Submitter: bm-test.desy.de : <$IP:58611> : $HOST.desy.de
---
179.9494:  Request has not yet been considered by the matchmaker.

User priority for chbeyer@xxxxxxx is not available, attempting to analyze without it.
---
179.9494:  Run analysis summary.  Of 53 machines,
      0 are rejected by your job's requirements 
     42 reject your job because of their own requirements 
      0 match and are already running your jobs 
      1 match but are serving other users 
     10 are available to run your job

The Requirements expression for your job is:

    ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
    ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) ) &&
    ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) ) &&
    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )

Your job defines the following attributes:

    DiskUsage = 7500
    ImageSize = 7500
    RequestDisk = 7500
    RequestMemory = 8

The Requirements expression for your job reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]          53  TARGET.Arch == "X86_64"
[1]          53  TARGET.OpSys == "LINUX"
[4]          53  CkptArch is undefined
[8]          53  CkptOpSys is undefined
[11]         53  TARGET.Disk >= RequestDisk
[13]         53  TARGET.Memory >= RequestMemory

Suggestions:

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( TARGET.Arch == "X86_64" )       53                   
2   ( TARGET.OpSys == "LINUX" )       53                   
3   ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) )
                                      53                   
4   ( ( CkptOpSys == TARGET.OpSys ) || ( CkptOpSys is undefined ) )
                                      53                   
5   ( TARGET.Disk >= 7500 )           53                   
6   ( TARGET.Memory >= ifthenelse(MemoryUsage isnt undefined,MemoryUsage,8) )
                                      53                   

The following attributes are missing from the job ClassAd:

CheckpointPlatform
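
For reference, the counts in the run analysis summary above can be tallied to check that all 53 machines are accounted for and to see how many are actually free. This is only a minimal sketch: it assumes the five summary lines are pasted into a string, and the names SUMMARY and tally_summary are hypothetical helpers of mine, not part of HTCondor.

```python
import re

# The five summary lines from the condor_q analysis output above,
# pasted in by hand (assumption: the format is "<count> <description>").
SUMMARY = """\
      0 are rejected by your job's requirements
     42 reject your job because of their own requirements
      0 match and are already running your jobs
      1 match but are serving other users
     10 are available to run your job
"""


def tally_summary(text):
    """Return a dict mapping each description to its machine count."""
    counts = {}
    for line in text.splitlines():
        # Leading whitespace, an integer count, then the description.
        m = re.match(r"\s*(\d+)\s+(.*\S)", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


counts = tally_summary(SUMMARY)
total = sum(counts.values())
print("machines accounted for:", total)
for description, n in counts.items():
    print(f"{n:4d}  {description}")
```

The total matches the "Of 53 machines" line, so the 42 machines rejecting the job on their own requirements are the ones I would like to understand.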

In the logs I see a lot of errors like this: 

04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257
04/22/15 10:58:15 (pid:8111) OwnerCheck(condor_pool) failed in SetAttribute for job 179.1257

Any hints are much appreciated!

best regards
        ~christoph


-- 
/*   Christoph Beyer     |   Office: Building 2b / 23     *\
 *   DESY                |    Phone: 040-8998-2317        *
 *   - IT -              |      Fax: 040-8994-2317        *
\*   22603 Hamburg       |     http://www.desy.de         */