
RE: [Condor-users] jobs fail to run, with "Warning: Found no submitters"



> Hello.  I've been struggling with a problem that is basically identical to
> the one described in this post from last year:
> 
> https://lists.cs.wisc.edu/archive/condor-users/pre-2004-June/msg01340.shtml
> 
> The problem is that I can submit jobs, but whatever jobs are submitted are
> rejected by all available nodes.
> 
> My cluster consists of one dual-cpu head node, and three diskless client
> nodes:
> 
> The Condor setup is very simple, pretty much default.  The head node has
> the following condor_config.local file:
> 
> ------------------------
> NETWORK_INTERFACE = 10.0.0.1
> DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD
> ------------------------
> 
> and the other nodes are using the
> <release_dir>/etc/examples/condor_config.local.dedicated.resource file
> which specifies the DedicatedScheduler as the head node.
> 
> I have made a single executable to calculate pi to 10000 digits (which
> works fine normally), which I am trying to submit with the following 
> command file:

> ~> condor_q -analyze
> Warning:  Found no submitters
> 
> -- Submitter: zajos.cluster : <10.0.0.1:44160> : zajos.cluster
>  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
> ---
> 012.000:  Run analysis summary.  Of 5 machines,
>       0 are rejected by your job's requirements
>       3 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       2 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
> 
> 1 jobs; 1 idle, 0 running, 0 held
> ------------------------
> 
> Does anyone have any idea what's going wrong?

Some suggestions:
- Turn up the level of logging and see what's in the schedd log,
collector log, and negotiator log.  See 

http://docs.optena.com/display/CONDOR/How+To+Increase+Debugging+Messages

This should help track down the 'Found no submitters' error.  The schedd
ought to be sending information about submitters (users like you that
have submitted jobs) to the collector, and this information goes to the
negotiator.  condor_q pulls this info from the negotiator.

- You say that you have three diskless machines - Condor may think
that they have no disk space and therefore can't run jobs.  Try
'condor_status -l | grep Disk' to see what your machines are advertising.
Try 'condor_q -l' to see your job's Requirements expression and DiskUsage.
There is probably a clause like ' && (Disk >= DiskUsage)' in the
Requirements, and this could be preventing jobs from starting on those
machines.

To disable this safety feature, you'll have to set something like

Requirements = (Disk >= 0)

in your submit file.
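
If you'd rather not compare the numbers by eye, the disk check above can
be sketched as a small script.  This is only a rough illustration, not
anything that ships with Condor: it assumes you have saved the output of
'condor_status -l' as text, that ads are separated by blank lines, and
that Disk is advertised as a plain integer in the same units as the
job's DiskUsage.  The function names, the second hostname, and the
numbers in SAMPLE are all made up for the example.

```python
def parse_classads(text):
    """Split 'condor_status -l' style output into one dict per ad.

    Assumes ads are separated by blank lines and each attribute
    appears on its own line as 'Attr = value'.
    """
    ads, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the ad we were collecting.
            if current:
                ads.append(current)
                current = {}
            continue
        name, sep, value = line.partition(" = ")
        if sep:
            current[name] = value.strip('"')
    if current:
        ads.append(current)
    return ads


def machines_short_on_disk(ads, disk_usage):
    """Return the Name of every ad whose Disk is below disk_usage.

    Assumes Disk is a literal integer; an ad without a Disk
    attribute is treated as advertising 0.
    """
    return [ad.get("Name", "?") for ad in ads
            if int(ad.get("Disk", 0)) < disk_usage]


# Hypothetical sample: one machine with disk space, one diskless node.
SAMPLE = """\
Name = "zajos.cluster"
Disk = 5000000

Name = "node1.cluster"
Disk = 0
"""
```

With the sample above, machines_short_on_disk(parse_classads(SAMPLE), 1000)
returns ['node1.cluster'], which is the kind of machine the
'(Disk >= DiskUsage)' clause would turn away.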

Mike Yoder
Principal Member of Technical Staff
Ask Mike: http://docs.optena.com
Direct  : +1.408.321.9000
Fax     : +1.408.321.9030
Mobile  : +1.408.497.7597
yoderm@xxxxxxxxxx

Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
http://www.optena.com



> Thanks.
> 
> jamie.