
Re: [Condor-users] Security instructions for basic condor cluster



On Thu, Apr 7, 2011 at 11:23 AM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:


On 4/4/11 10:20 PM, Brian O'Meara wrote:
Hi, all. I am trying a small Condor install in my lab group with the eventual goal of expanding it to our department. I've installed Condor version 7.5.6 on Mac OS 10.6.7 on both a desktop (manager, execute, submit) and a laptop (execute, submit) [I tried using the stable build, but had an issue with ssh access to my desktop stopping working, so decided to try the dev build instead]. Basically, I think I need a pointer to a good guide to setting up proper authentication (yes, I have RTFM and also Googled). My immediate problem (and this might just be an indicator of my broader issues) is that when I do condor_submit on the laptop (non-manager computer), I get:

   Submitting job(s)
   ERROR: Failed to create cluster

And when I look at the SchedLog on the manager for that time, I see:

   04/04/11 20:52:34 (pid:52253) IPVERIFY: unable to resolve IP address of omearalab6.local

The above message likely means that $(FULL_HOSTNAME) is included somewhere in your ALLOW settings, and since the hostname doesn't have a DNS record, Condor is failing to turn that into an IP address.  You can confirm that with something like the following command:

condor_config_val -dump | grep -i ALLOW

That failure is not critical as long as some other ALLOW setting matches the IP address of the connection from condor_submit.  Are there any other lines in the log that could help indicate what is going wrong?  Is authentication failing?  Is authorization failing?
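To make the point above concrete, here is a minimal Python sketch of the idea that an unresolvable entry is simply skipped while the remaining entries are still checked. It is an illustration only, not Condor's actual IPVERIFY code: the helper names (`resolve_or_none`, `ip_allowed`, `fake_resolver`) and the 192.0.2.x addresses are hypothetical, and a lookup table stands in for DNS so the example runs without a network.

```python
import socket

# Hypothetical lookup table standing in for DNS; the address is made up.
FAKE_DNS = {"omearalab5.ee.utk.edu": "192.0.2.5"}

def fake_resolver(host):
    """Resolve from the table above, failing the way a DNS lookup would."""
    try:
        return FAKE_DNS[host]
    except KeyError:
        raise socket.gaierror("cannot resolve " + host)

def resolve_or_none(host, resolver):
    """Return the IP address for host, or None if it cannot be resolved."""
    try:
        return resolver(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None

def ip_allowed(peer_ip, allow_hosts, resolver=fake_resolver):
    """True if any resolvable entry in allow_hosts maps to peer_ip."""
    for host in allow_hosts:
        ip = resolve_or_none(host, resolver)
        if ip is None:
            # Unresolvable entry: worth logging, but not fatal by itself.
            continue
        if ip == peer_ip:
            return True
    return False

allow = ["omearalab6.local", "omearalab5.ee.utk.edu"]
print(ip_allowed("192.0.2.5", allow))   # another entry matches
print(ip_allowed("192.0.2.99", allow))  # nothing matches
```

In this sketch the failed lookup of omearalab6.local only matters when no other entry covers the connecting address.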


Thank you, Dan. Between the original email and your reply, I reinstalled and reconfigured Condor, and I no longer get that error. I changed ALLOW_WRITE from

manager;submit;execute:    ALLOW_WRITE = bomeara@*/127.0.0.1, 127.0.0.1, omearalab5.ee.utk.edu, omearalab6.local
submit;execute:            ALLOW_WRITE = 127.0.0.1, omearalab5.ee.utk.edu, omearalab8.ee.utk.edu, omearalab3.dyndns.org, omearalab6.dyndns.org

To the less secure:
ALLOW_WRITE = *.utk.edu, 160.36.*

and it seems to work. For completeness, the output of all my ALLOW* settings (condor_config_val -dump | grep -i ALLOW) is:

ALLOW_ADMINISTRATOR = $(CONDOR_HOST)
ALLOW_NEGOTIATOR = $(CONDOR_HOST)
ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS)
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)
ALLOW_READ = *
ALLOW_READ_COLLECTOR = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_WRITE = *.utk.edu, 160.36.*
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD = $(ALLOW_WRITE), $(FLOCK_FROM)
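As a rough picture of how broad the wildcard ALLOW_WRITE entries above are, the following Python sketch matches a peer's hostname and IP address against them using shell-style globbing. This is an assumption-laden illustration, not Condor's matching code: `parse_allow` and `peer_matches` are hypothetical helpers, and the sample peer addresses are invented.

```python
from fnmatch import fnmatchcase

# The ALLOW_WRITE value from the configuration dump above.
ALLOW_WRITE = "*.utk.edu, 160.36.*"

def parse_allow(value):
    """Split a comma-separated allow list into trimmed entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]

def peer_matches(peer_host, peer_ip, entries):
    """True if the peer's hostname or IP matches any wildcard entry."""
    host = peer_host.lower()
    return any(
        fnmatchcase(host, entry.lower()) or fnmatchcase(peer_ip, entry)
        for entry in entries
    )

entries = parse_allow(ALLOW_WRITE)

# Hostname inside utk.edu: allowed regardless of address.
print(peer_matches("omearalab5.ee.utk.edu", "192.0.2.10", entries))
# Hostname without a DNS record but an address in 160.36.*: still allowed.
print(peer_matches("omearalab6.local", "160.36.1.2", entries))
# Neither the hostname nor the address matches: denied.
print(peer_matches("omearalab6.local", "192.0.2.10", entries))
```

Under these assumptions, any machine with a utk.edu hostname or a 160.36.x.x address is granted write access, which is why the setting is less secure than an explicit host list.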

Now my manager/submit/execute machine can successfully run jobs on itself and on the submit/execute node. There is a new problem: jobs submitted from the submit/execute node show up in its own condor_q but not in condor_q on the manager, so they are not run. I still have to search the web for this issue; I imagine it has come up and been solved before, so it is not worth bothering the list yet.

This still uses host-based authentication, however, and a pretty coarse-grained form of it at that. Can you, or someone else, point me to a basic but thorough tutorial on password-based security? Past Condor Week presentations touch on this only briefly, and I haven't been able to find much other information. I'll try setting this up again using the pool password info from the manual and the OSG documentation, and will let the list know how it goes (I'll post a detailed tutorial if it works).

Thank you again,
Brian