
Re: [Condor-users] schedd problem




Janzen Brewer wrote:
The problem:
I can't seem to get submitted jobs to run from the public network. When
I run condor_q -analyze #.#, I get the familiar "Of 72 machines, ... 72
match but reject the job for unknown reasons". The odd thing is that
jobs submitted from the private network (i.e. from a compute machine) do
run. There wasn't anything very interesting in the logs on the CMs or
compute machines, but I did find this in SchedLog on the public network
machine I submitted the job from:


7/8 16:04:47 (pid:24838) Can't find address for startd jhb-579.stuff.gatech.edu

This message indicates that your schedd has not heard from the negotiator in a long time, so it is trying to find a local startd that it can claim directly, without negotiating for it.

So the question is why the negotiator is not connecting to your schedd. Does your schedd show up in the collector?

condor_status -schedd

Does the negotiator log mention that it is trying and failing to connect to your schedd?
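
One way to check is to search the negotiator log on the central manager for lines that mention the submit machine. The helper below is only a rough sketch: the function name find_host_in_log is made up for illustration, and it does a plain fixed-string search, so it assumes the log refers to the schedd by the host name you pass in (it may use an IP address instead).

```shell
# Hypothetical helper (not part of Condor): print every line of a log file
# that mentions the given host name. Uses a fixed-string match, so dots in
# the host name are not treated as regex wildcards.
find_host_in_log() {
    log_file=$1
    host=$2
    grep -F -- "$host" "$log_file"
}
```

On the central manager, condor_config_val NEGOTIATOR_LOG prints the path of the negotiator log, so a call might look like: find_host_in_log "$(condor_config_val NEGOTIATOR_LOG)" your-submit-host, where your-submit-host is a placeholder for the public network machine. No output means no line matched that name, which by itself does not prove the negotiator never tried to contact the schedd.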

--Dan