Re: [HTCondor-users] Access Problems

On Sat, Aug 24, 2013 at 05:39:45PM -0700, Andrey Kuznetsov wrote:
> Hi,
> We have a couple of computers running in a cluster on master machine with
> subdomain XXX.
> We have a machine with subdomain YYY connected via a direct network on
> secondary network card, so there's an internal network between XXX and YYY
> machines.
> The internal network between XXX and YYY is indicated in the /etc/hosts file
> as:
> XXX.ucsc.edu XXX
> and
> YYY.ucsc.edu YYY
> on their respective machines
> We also have computers such as ZZZ, and etc which do not have internal network
> to XXX, the main pool machine.
> All the machines are on ucsc.edu domain name, UID_DOMAIN, FILESYSTEM_DOMAIN are
> set to it.
> the ALLOW READ and WRITE are set to *.ucsc.edu and 10.0.0.*
> Machines like ZZZ and XXX can submit and run jobs fine, because they are
> allowed access.
> My problem is with machine YYY which has an internal network setup with machine
> XXX.
> What is happening is that YYY talks to XXX over internal network because of the
> hosts file, and machine XXX tries to authenticate machine YYY.

Actually, it appears that they are NOT talking over the internal network, as
shown in the error message below.

> It first does a forward name resolution of YYY.ucsc.edu which turns out to be
> because of the /etc/hosts file.
> Then it does a reverse DNS lookup on YYY.ucsc.edu and returns an external IP
> address of that machine.

"Reverse DNS" means mapping an IP address back to a hostname.

Also, Condor starts with the IP address that the request came in on and
first does a reverse lookup.  For security reasons, it then does a forward
lookup of the resulting hostname, if one exists, to see whether it matches the
original IP or its aliases.  So the hypothetical scenario above doesn't really
apply here.
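
To make the order of lookups concrete, here is a minimal Python sketch of a reverse lookup followed by a confirming forward lookup. It is not Condor's code, only an illustration of the idea described above; the function name and the injectable `reverse`/`forward` parameters are assumptions made for this example.

```python
import socket

def forward_confirmed_name(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the hostname for `ip` only if a forward lookup maps it back.

    Hypothetical helper for illustration; not part of HTCondor.
    """
    try:
        # Step 1: reverse lookup, IP address -> hostname.
        hostname, _aliases, _addrs = reverse(ip)
    except OSError:
        return None  # no reverse mapping for this address
    try:
        # Step 2: forward lookup, hostname -> list of IP addresses.
        _name, _aliases, addrs = forward(hostname)
    except OSError:
        return None  # the name does not resolve at all
    # Step 3: trust the hostname only if it maps back to the original IP.
    return hostname if ip in addrs else None
```

In a setup like the one described, if the name resolves only to the internal address (for example through /etc/hosts) while the request arrives from the public address, step 3 fails and no hostname is associated with the request, so only the raw IP is left to match against the ALLOW entries.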

> PERMISSION DENIED to unauthenticated@unmapped from host 128.114.###.YYY for
> command 1111 (QMGMT_READ_CMD), access level READ: reason: READ authorization
> policy contains no matching ALLOW entry for this request; identifiers used for
> this host: 128.114.###.YYY, hostname size = 0, original ip address =
> 128.114.###.YYY

This shows that the request came in to the schedd on the public IP address.

If you do a "condor_status -schedd -long" and look at the resulting output, do
you see public or private IP addresses?  This is essentially how the condor_q
tool locates the schedd it wants to communicate with.  (Feel free to send the
output to me off-list if you want me to look at it.)
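
If the output is long, a small script can pull out the addresses and label each one. The sketch below assumes the ads contain lines of the form `MyAddress = "<ip:port...>"` with an IPv4 address; the function name is made up for this example, and "private" here simply means what Python's `ipaddress` module reports as private.

```python
import ipaddress
import re

# Assumed line shape: MyAddress = "<10.0.0.5:9618?...>" (IPv4 only).
ADDR_RE = re.compile(r'^MyAddress\s*=\s*"<([0-9.]+):\d+')

def schedd_address_kinds(long_output):
    """Return (ip, "private" | "public") for each MyAddress line found."""
    kinds = []
    for line in long_output.splitlines():
        match = ADDR_RE.match(line.strip())
        if match:
            ip = ipaddress.ip_address(match.group(1))
            kinds.append((str(ip), "private" if ip.is_private else "public"))
    return kinds
```

One way to use it is to save the condor_status output to a file and pass the file's contents to `schedd_address_kinds`.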