
Re: [Condor-users] AFS & Condor



Erik Paulson wrote:
Franco - this is basically the configuration we have at UW Computer Sciences.
One difference is that we have a /afs/../condor/common for arch-independent
things, like the Perl modules.

Very good, so I'm not so weird in using afs... :)

You need to tell Condor a user to run as when it tries to drop its root
privileges. A Condor daemon can be one of three "users":
1. The root user
2. The condor user
3. The actual user who submitted a job

It would be better not to run Condor as root, if possible. The thing is, I don't want to create a condor user, so I'm thinking of letting users start the service if needed, running it as the ``daemon'' user.


If the condor daemons are started as root (i.e. a uid that can switch to
other uids), then each of those "users" needs to be a different uid. If you
start the condor daemons as a non-root user, then we don't try to switch
UIDs - all three users are always the same.

Mmmh... let me understand. In my network we have AFS and Kerberos 5, and we rely only on Kerberos (MIT) for authentication. So every user gets his uid from LDAP, which matches the PTS entry and the user:group ownership (all home dirs have been chowned accordingly).


Do you have any concerns about this configuration? Should there be any problems?

When you start as root, the UIDs involved are:

1. The root user is uid 0
2. The condor user is whatever uid we can find in the password file for user 'condor' - UNLESS you put CONDOR_IDS=x.y in your config or environment,
where x and y are the UID and GID condor should use instead of looking in the password file. The most common case is using the UID and GID
of the unix user 'daemon'

Good, this would be what I wanted.
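For reference, a minimal sketch of that setting, assuming the unix 'daemon' user has uid 1 and gid 1 (true on many Linux systems, but check /etc/passwd on yours):

```
# In condor_config (or exported in the daemons' environment):
# the uid.gid the daemons switch to instead of looking up a 'condor' user.
# 1.1 is the 'daemon' user/group on many Linux systems -- verify locally.
CONDOR_IDS = 1.1
```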

3. The actual user depends on what we're doing, of course- on the submit side,
   when we want to write a file as a user we use seteuid() to be that user and
   write out the file. If we're executing a job, we look to see if we're in
   the same UID_DOMAIN as the submit machine, and if we are, we switch to the
   uid of the submitting user, otherwise we switch to 'nobody'

Will it work for Kerberos principals? I've set an ACL on a scratch space under AFS for all the machines: dia:all rlidwk (dia is my dept.).


My concern is about read and write permissions. Let me make a small example.

I have a script that a user can use to start the cluster, obtaining Kerberos V tickets and AFS tokens on all the machines. He has three options for his jobs:

- /afs/cell/public/space     dia:all rlidwk
- /afs/cell/usr/username...  depends
- /data/                     LOCAL directory present on every machine

What do you suggest? I'd go for the first option... it's the easiest in my opinion... but maybe problems will occur with locks?

In computer sciences, we have a local directory on every machine that is
the condor user home directory (/var/home/condor) - in there, we have
a symlink that points /var/home/condor/condor_config to /afs/cs.wisc.edu/unsup/condor/etc/condor_config.

Ok.
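So something like this sketch, I suppose (CONDOR_HOME defaults to a scratch path here just for illustration; in your setup it would be /var/home/condor):

```shell
# Sketch: give each machine a local Condor home whose condor_config
# is a symlink into the shared copy in AFS.
# CONDOR_HOME and AFS_CONFIG are example values -- adjust per site.
CONDOR_HOME="${CONDOR_HOME:-$PWD/condor-home-demo}"
AFS_CONFIG="/afs/cs.wisc.edu/unsup/condor/etc/condor_config"

mkdir -p "$CONDOR_HOME"
# -s: symbolic link, -f: replace any stale link from a previous run
ln -sf "$AFS_CONFIG" "$CONDOR_HOME/condor_config"
```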

In /afs/.../etc/condor_config, we define
LOCAL_CONFIG_FILE = /afs/cs.wisc.edu/unsup/condor/etc/hosts/$(HOSTNAME).local

Uh... ok. Is it really important? I have 8 identical machines... the OS X ones will come later. I don't think I need to make local config files; they should all be just the same.


in $(HOSTNAME).local - we define a $(LOCAL_DISK) to be the right place for
that machine (usually /scratch/condor, but not always), and then we setup a
bunch of paths to be:

SBIN = /afs/.../condor/sbin
BIN = /afs/.../condor/bin
LIB = /afs/.../condor/lib
LOG = $(LOCAL_DISK)/log
SPOOL = $(LOCAL_DISK)/spool
EXECUTE = $(LOCAL_DISK)/execute

Sure, that's what I would do. So you think /data/condor, a local directory present on all machines, would be the right choice?
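For example, a hypothetical $(HOSTNAME).local using /data/condor as the local disk (the file name is illustrative):

```
# Hypothetical host-local file: hosts/mynode.local
# The LOG, SPOOL, and EXECUTE paths defined in the shared config
# then all expand relative to this per-machine directory.
LOCAL_DISK = /data/condor
```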


that way, we keep most of the condor install in AFS, but machine-specific log files stay on the machine.

If you have AFS, you should be able to look around in /afs/cs.wisc.edu/unsup/condor/ and see exactly how we've got it set up - we
use @sys pretty heavily, and we split our config files up into 6 different
files so we can customize things globally, per-platform, and per-host.

Good. I'll mount your cell and take a look if it's permitted...

Only put things out in AFS that you want to update once and have everything
pick up on it, or things that change only once in a great while. Don't
put your EXECUTE, SPOOL, LOG, or LOCK directories in AFS.

Ok. I'll go for /data/condor for everything. Jobs will be started by AFS users in their home directory. Is that good, or should they use /data/condor instead?


--
Franco Milicchio <mailto:milicchio@xxxxxxxx>

No keyboard found. Press F1 to continue...
(Almost every BIOS available in this world... even yours!)