Re: [Condor-users] AFS & Condor

On Fri, Feb 18, 2005 at 03:43:29PM -0600, Franco Milicchio wrote:
> Hi.
> I'm planning to install condor on an AFS cell. What I'm planning to do, 
> is this:
> - /afs/cell.name/proj/condor -> /afs/cell.name/proj/condor.@sys
> - install condor (linux 2.4) in /afs/cell.name/proj/condor.i386_linux24
> - install condor (linux 2.6) in /afs/cell.name/proj/condor.i386_linux26
> - install condor (macosx) in /afs/cell.name/proj/condor.ppc_darwin_70
> - have all symlinks for bin, sbin & co on @sys
> Every user using an afs account will have rights to read and write on
> /afs/cell.name/proj/condor-jobs, acl based on ip addresses will allow 
> writing on this volume.
> Now the questions :)
> Has anyone a configuration like this one?

Franco - this is basically the configuration we have at UW Computer Sciences.
One difference is that we have a /afs/../condor/common for arch-independent
things, like the Perl modules.

> Is it necessary a condor user? I don't think so...

You need to tell Condor which user to run as when it drops its root
privileges. A condor daemon can be one of three "users":
1. The root user 
2. The condor user
3. The actual user who submitted a job

If the condor daemons are started as root (i.e., a uid that can switch to
other uids), then each of those "users" needs to be a different uid. If you
start the condor daemons as a non-root user, then we don't try to switch
UIDs - all three users are always the same.

When you start as root, the UIDs involved are:

1. The root user is uid 0
2. The condor user is whatever uid we can find in the password file for 
   user 'condor' - UNLESS you put in your config or environment CONDOR_IDS=x.y,
   where x and y are the UID and GID condor should use instead of looking in 
   the password file. The most common case for this is using the UID and GID
   for the unix user 'daemon'.
3. The actual user depends on what we're doing, of course. On the submit side,
   when we want to write a file as a user, we use seteuid() to become that user
   and write out the file. If we're executing a job, we look to see if we're in
   the same UID_DOMAIN as the submit machine. If we are, we switch to the
   uid of the submitting user; otherwise we switch to 'nobody'.
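
As a rough illustration of the two settings mentioned above, a config fragment
might look like the sketch below. The UID.GID pair and the domain name are
hypothetical placeholders, not values from any real installation:

```
# Hypothetical values: substitute the real UID.GID of the account Condor
# should run as (for example, whatever `id daemon` reports on your machines).
CONDOR_IDS = 2.2

# Hypothetical domain name. Jobs run as the submitting user only when the
# submit and execute machines are in the same UID_DOMAIN; otherwise they
# run as 'nobody'.
UID_DOMAIN = cell.name
```

If you leave CONDOR_IDS out, the uid is looked up in the password file for
user 'condor', as described in item 2.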

> How to install condor on AFS with the minimum amount of waste? I mean, 
> when I will install it on linux, I will do it first on the job 
> submitting machine... How to make the other linux machines aware of 
> condor if I share binaries and configurations?

In Computer Sciences, we have a local directory on every machine that is
the condor user's home directory (/var/home/condor). In there, we have
a symlink that points /var/home/condor/condor_config to 

In /afs/.../etc/condor_config, we define
LOCAL_CONFIG_FILE = /afs/cs.wisc.edu/unsup/condor/etc/hosts/$(HOSTNAME).local

In $(HOSTNAME).local, we define a $(LOCAL_DISK) to be the right place for
that machine (usually /scratch/condor, but not always), and then we set up a
bunch of paths to be:

SBIN = /afs/.../condor/sbin
BIN = /afs/.../condor/bin
LIB = /afs/.../condor/lib

That way, we keep most of the condor install in AFS, but machine-specific
log files stay on the machine.
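
A per-host file along these lines might look roughly like the sketch below.
The subdirectory names under $(LOCAL_DISK) are assumptions for illustration,
not taken from the actual UW files:

```
# Machine-specific scratch area (usually /scratch/condor, but not always).
LOCAL_DISK = /scratch/condor

# Per-machine state stays on local disk, not in AFS.
# The subdirectory names below are assumed for this example.
EXECUTE = $(LOCAL_DISK)/execute
SPOOL   = $(LOCAL_DISK)/spool
LOG     = $(LOCAL_DISK)/log
LOCK    = $(LOG)
```

The point of the split is that the shared binaries live in AFS and are updated
once, while anything a daemon writes to frequently stays on the local machine.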

If you have AFS, you should be able to look around in
/afs/cs.wisc.edu/unsup/condor/ and see exactly how we've got it set up. We
use @sys pretty heavily, and we split our config files up into 6 different
files, so we can customize things globally, per platform, and per host.

> I'm quite confused using AFS... but it's the smartest way of sharing 
> data across multiple machines...

Only put things out in AFS that you want to update once and have everything
pick up on it, or things that change only once in a great while. Don't
put your EXECUTE, SPOOL, LOG, or LOCK directories in AFS.