Janzen Brewer wrote:
The problem: I can't seem to get submitted jobs to run from the public network. When I run condor_q -analyze #.#, I get the familiar "Of 72 machines, ... 72 match but reject the job for unknown reasons". The odd thing is that jobs submitted from the private network (i.e. from a compute machine) do run. There wasn't anything very interesting in the logs on the CMs or compute machines, but I did find this in SchedLog on the public network machine I submitted the job from:
7/8 16:04:47 (pid:24838) Can't find address for startd jhb-579.stuff.gatech.edu
This indicates that your schedd has not heard from the negotiator in a long time, so it is trying to find a local startd that it can claim directly without negotiating for it.
So the question is why the negotiator is not connecting to your schedd. Does your schedd show up in the collector?
condor_status -schedd

Does the negotiator log mention that it is trying and failing to connect to your schedd?
--Dan
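
The two checks suggested above can be sketched as a short shell script. This is only a sketch under stated assumptions: the helper name negotiator_mentions, the SCHEDD_HOST variable, and the placeholder host name submit.example.org are hypothetical and not from the original thread. It also assumes the log lookup is run on the central manager, where the negotiator writes its log.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a log file that mention a given host.
# Usage: negotiator_mentions <logfile> <schedd-host>
negotiator_mentions() {
    # -F treats the host name as a fixed string, so its dots are not wildcards.
    grep -F -- "$2" "$1"
}

# Placeholder host name (an assumption); set SCHEDD_HOST to the submit machine.
SCHEDD_HOST=${SCHEDD_HOST:-submit.example.org}

# Only query HTCondor when its tools are installed on this machine.
if command -v condor_status >/dev/null 2>&1; then
    # Check 1: does the schedd show up in the collector?
    condor_status -schedd

    # Check 2: does the negotiator log mention the schedd at all?
    # NEGOTIATOR_LOG is the configuration macro that holds the log path.
    log=$(condor_config_val NEGOTIATOR_LOG)
    if [ -r "$log" ]; then
        negotiator_mentions "$log" "$SCHEDD_HOST"
    fi
fi
```

If the first command does not list the submit machine, the schedd is not reaching the collector. If it is listed but the second check prints connection failures, the negotiator cannot reach the schedd, which would be consistent with jobs running only when submitted from the private network.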