On 05/20/2013 02:53 PM, Brian Candler wrote:
On Mon, May 20, 2013 at 02:38:31PM -0400, Dan Shea wrote:
Adding STARTD to the gatekeeper node caused all jobs queued to be
executed on the gatekeeper.
It seems the gatekeeper machine cannot see the execute-only nodes.
I'm not sure what I have missed in the configuration to cause this
behaviour. Network-wise they all see each other just fine, with hostnames
resolved via /etc/hosts entries.
Have you set ALLOW_WRITE, if so to what?

Currently I am attempting to limit things to the local network. Perhaps this is not the correct way to wildcard a subnet?

ALLOW_WRITE = 10.11.114.*
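
For illustration only, here is a rough Python sketch of which addresses a wildcard such as 10.11.114.* is meant to cover. It uses the standard fnmatch and ipaddress modules. It is not HTCondor's own matching code, and the function names matches_wildcard and in_subnet are made up for this example. The point is simply that an address such as 127.0.0.1 falls outside the pattern.

```python
import fnmatch
import ipaddress


def matches_wildcard(ip, pattern):
    """Glob-style match of a dotted-quad address against a pattern.

    Hypothetical helper: it only approximates the intent of a
    wildcard entry and is not how HTCondor evaluates ALLOW_WRITE.
    """
    return fnmatch.fnmatchcase(ip, pattern)


def in_subnet(ip, cidr):
    """Equivalent check expressed as a network/prefix test."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    pattern = "10.11.114.*"
    # Sample addresses are assumptions chosen for the demonstration.
    for ip in ("10.11.114.7", "10.11.115.7", "127.0.0.1"):
        print(ip, matches_wildcard(ip, pattern), in_subnet(ip, "10.11.114.0/24"))
```

Both checks agree for the sample addresses: only the one inside 10.11.114.0/24 passes.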

SchedLog:05/17/13 13:41:21 (pid:9037) WARNING: forward resolution of localhost.localdomain doesn't match!
This does look like a problem. What does "hostname" show on all the nodes?
Do you have a "localhost.localdomain" entry in /etc/hosts? Normally it would
be for, don't be tempted to set it to the external IP of your

hostname returns node00 through node09, depending on which node you are on. The localhost.localdomain entry in /etc/hosts has not been modified; it still points to loopback. I think I do see the issue, however:

node00 localhost localhost.localdomain node00

Thanks Brian, let me correct the /etc/hosts entries and see if it fixes things a bit.
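
A quick way to spot this kind of /etc/hosts problem on each node is sketched below in Python. It scans hosts-style text and reports any name, other than the usual localhost aliases, that is mapped to a loopback address. The function name loopback_names and the sample entries (including the 127.0.0.1 and 10.11.114.10 addresses) are assumptions for the example, not copied from the actual files.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return {name: address} for names mapped to a loopback address.

    Names containing "localhost" or "loopback" are treated as the
    expected aliases and skipped. Hypothetical helper for illustration.
    """
    found = {}
    for raw in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loop = ipaddress.ip_address(address).is_loopback
        except ValueError:
            # First field is not an IP address; skip the line.
            continue
        if not is_loop:
            continue
        for name in names:
            if "localhost" in name or "loopback" in name:
                continue
            found[name] = address
    return found


if __name__ == "__main__":
    # Assumed sample content; on a real node, read /etc/hosts instead.
    sample = (
        "127.0.0.1 node00 localhost localhost.localdomain node00\n"
        "10.11.114.10 node01\n"
    )
    print(loopback_names(sample))
```

On a real node, passing in the contents of /etc/hosts should return an empty dict once the node name has been moved off the loopback line.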


