
Re: [HTCondor-users] No Starter found to Run



Look at the StartLog on the compute nodes; that should give you some
more clues. If you don't see anything useful there, set the debug level
of the startd and the starter to D_FULLDEBUG (the STARTD_DEBUG and
STARTER_DEBUG configuration settings). I've never seen this problem
before, but it should be possible to figure out with the right debug
level.
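
Once the StartLog gets verbose, it can help to filter it down to the
slot state changes and the refusals. What follows is only a rough
sketch, not an HTCondor tool: the function names are made up for
illustration, and the timestamp format and message strings are assumed
to match the StartLog snippet quoted below. It also joins continuation
lines that lack a timestamp, which is how wrapped entries show up when
a log is pasted into mail.

```python
import re
from typing import Iterable, List, Tuple

# Assumed timestamp prefix, taken from the quoted log: "06/18/13 14:07:19 ".
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")
# Assumed transition format: "slot1: Changing state: Unclaimed -> Matched".
TRANSITION = re.compile(r"^(\S+): Changing state: (\S+) -> (\S+)")
# Substrings treated as signs of trouble (a hypothetical, incomplete list).
PROBLEM_MARKERS = ("No starter found", "refused", "failed")


def join_wrapped(lines: Iterable[str]) -> List[str]:
    """Merge lines without a leading timestamp into the previous entry."""
    entries: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def summarize_startlog(
    lines: Iterable[str],
) -> Tuple[List[Tuple[str, str, str]], List[str]]:
    """Return (slot, old_state, new_state) transitions and problem messages."""
    transitions: List[Tuple[str, str, str]] = []
    problems: List[str] = []
    for entry in join_wrapped(lines):
        # Drop the timestamp so only the message text is inspected.
        message = TIMESTAMP.sub("", entry, count=1)
        match = TRANSITION.match(message)
        if match:
            transitions.append((match.group(1), match.group(2), match.group(3)))
        elif any(marker in message for marker in PROBLEM_MARKERS):
            problems.append(message)
    return transitions, problems
```

To try it, pass an open file to summarize_startlog(); the location of
the log on a given node is reported by "condor_config_val STARTD_LOG".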

Steve Timm


On Tue, 18 Jun 2013, Vishal Shah wrote:

Hello,

I am having trouble configuring Condor to run jobs. One node acts as the
head node: jobs are submitted from it, and all of the nodes in the
compute unit are visible from it. When I submit a job, the logs on the
head node show that it was submitted, but the job never runs. The queue
also shows the job, but its state stays pending indefinitely. The
following is a snippet of the StartLog on the compute node:

06/18/13 14:07:19 slot1: Received match <10.144.6.164:9532>#1371557953#351#...
06/18/13 14:07:19 slot1: State change: match notification protocol successful
06/18/13 14:07:19 slot1: Changing state: Unclaimed -> Matched
06/18/13 14:07:19 slot1: No starter found to run this job!  Is something wrong with your Condor installation?
06/18/13 14:07:19 slot1: Request to claim resource refused.
06/18/13 14:07:19 slot1: State change: claiming protocol failed
06/18/13 14:07:19 slot1: Changing state: Matched -> Owner
06/18/13 14:07:19 slot1: State change: IS_OWNER is false
06/18/13 14:07:19 slot1: Changing state: Owner -> Unclaimed

Does anybody have insight into this issue?

Thanks,
Vishal



------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.