
[Condor-users] jobs fail to run, with "Warning: Found no submitters"

Hello.  I've been struggling with a problem that is basically identical to the
one described in this post from last year:


The problem is that I can submit jobs, but whatever jobs are submitted are
rejected by all available nodes.

My cluster consists of one dual-CPU head node and three diskless client nodes:

~> condor_status

Name          OpSys       Arch   State      Activity   LoadAv Mem   ActvtyTime

node1.cluster LINUX       X86_64 Unclaimed  Idle       0.950   435[?????]
node2.cluster LINUX       X86_64 Unclaimed  Idle       1.120   435  0+00:53:42
node3.cluster LINUX       X86_64 Unclaimed  Idle       1.000   435  0+01:00:47
vm1@xxxxxxxxx LINUX       X86_64 Owner      Idle       1.000  1002  4+20:07:37
vm2@xxxxxxxxx LINUX       X86_64 Unclaimed  Idle       0.210  1002  0+00:00:00

                     Machines Owner Claimed Unclaimed Matched Preempting

        X86_64/LINUX        5     1       0         4       0          0

               Total        5     1       0         4       0          0

The Condor setup is very simple, pretty much default.  The head node has the
following condor_config.local file:


and the other nodes are using the
<release_dir>/etc/examples/condor_config.local.dedicated.resource file which
specifies the DedicatedScheduler as the head node.

I have made a single executable to calculate pi to 10000 digits (which works
fine normally), which I am trying to submit with the following command file:

Executable = pi2
output = pi2.out 
Log = pi2.log                                                    
Universe = vanilla

The result is the following:

~> condor_q -analyze
Warning:  Found no submitters

-- Submitter: zajos.cluster : <> : zajos.cluster
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
012.000:  Run analysis summary.  Of 5 machines,
      0 are rejected by your job's requirements
      3 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      2 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

1 jobs; 1 idle, 0 running, 0 held
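
For anyone comparing their own output against the summary above, the counts
can be tallied with a short script.  This is only a rough sketch: the function
name tally_analysis is made up for illustration, and it assumes the summary
lines look exactly like the ones pasted above, which may differ between Condor
versions.

```python
import re

# Summary text copied from the condor_q -analyze output above.
SAMPLE = """\
012.000:  Run analysis summary.  Of 5 machines,
      0 are rejected by your job's requirements
      3 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      2 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
"""


def tally_analysis(text):
    """Return (total_machines, {reason: count}) parsed from the summary text.

    Hypothetical helper: it assumes the layout shown in SAMPLE.
    """
    total = None
    counts = {}
    for line in text.splitlines():
        # Header line carries the total, e.g. "Of 5 machines,".
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        # Remaining lines are "<count> <reason>".
        row = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if row:
            counts[row.group(2)] = int(row.group(1))
    return total, counts


total, counts = tally_analysis(SAMPLE)
# Print only the categories that actually account for machines.
for reason, n in counts.items():
    if n:
        print(f"{n} of {total}: {reason}")
```

In my case this leaves two categories to explain: the three machines that
reject the job because of their own requirements, and the two that match but
reject it for unknown reasons.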

Does anyone have any idea what's going wrong?  I'm wondering what types of
misconfigurations to look for, or ways in which I can more specifically debug
what's going on.  Unfortunately the thread mentioned above ended with a phone
call instead of a posting to the list.  Any help would be most appreciated.