condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Tom Paschenda
Sent: Saturday, 24 March 2007 1:06
condor_store_credd -c add prevents jobs from running
I had the same problem as Jeffrey in the same scenario (see mails below).
I tried to fix it using
condor_store_cred -c add.
After that, when submitting a job from the slave, the following problem occurred:
No credential stored for
make sure your
HOSTALLOW_WRITE setting includes this host.
This surprises me, since HOSTALLOW_WRITE and
HOSTALLOW_CONFIG are set to * on all machines.
Does anyone have a hint?
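One way to double-check that claim is to compare the effective settings machine by machine rather than trusting the files by eye. Below is a minimal sketch in Python, assuming each machine's settings have been saved as plain "NAME = value" text (for example from condor_config_val -dump). The helper names parse_config and find_mismatches, and the sample data, are made up for illustration and are not part of Condor. The parser is deliberately simplified: it does not expand $(MACRO) references or handle line continuations. It lets a later definition replace an earlier one, which, as far as I understand, is also how a local config file ends up overriding the global one.

```python
# Hypothetical helpers (not part of Condor): compare selected settings
# across saved configuration text from several machines.
# Simplified parser: "NAME = value" lines and "#" comments only;
# no $(MACRO) expansion, no line continuations.

def parse_config(text):
    """Return {NAME: value}. Names are upper-cased; a later line replaces an earlier one."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().upper()] = value.strip()
    return settings


def find_mismatches(dumps, names):
    """dumps maps host -> config text. Return the names whose values differ
    between hosts or are unset on at least one host."""
    parsed = {host: parse_config(text) for host, text in dumps.items()}
    mismatches = {}
    for name in names:
        values = {host: cfg.get(name.upper()) for host, cfg in parsed.items()}
        if len(set(values.values())) > 1 or None in values.values():
            mismatches[name] = values
    return mismatches


if __name__ == "__main__":
    # Made-up sample data, not taken from the pool discussed in this thread.
    dumps = {
        "manager": "HOSTALLOW_WRITE = *\nHOSTALLOW_CONFIG = *\n",
        "slave": (
            "HOSTALLOW_WRITE = *\n"
            "HOSTALLOW_WRITE = $(FULL_HOSTNAME)\n"  # a later line overrides the first
            "HOSTALLOW_CONFIG = *\n"
        ),
    }
    wanted = ["HOSTALLOW_WRITE", "HOSTALLOW_CONFIG"]
    for name, values in find_mismatches(dumps, wanted).items():
        print(name, values)
```

If a setting shows up here with different values, or is missing on one host, that machine's local config file is the first place to look.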
When your job tries to start, it probably uses a shared pool password to
authenticate against the credd.
Did you set the shared pool password on all machines?
condor_store_cred -c add
condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jeffrey Stephen
Sent: 08 March 2007 06:40
Subject: [Condor-users] jobs don't run when using condor_credd
I am trying to set up condor_credd on Windows XP. I have a central manager
machine (nes30700) and one submit/execute (i.e. slave) machine (nes15300).
The slave machine is configured to always run jobs:
OpSys    Arch   State      Activity  LoadAv  Mem   ActvtyTime
WINNT51  INTEL  Owner      Idle       0.040  1023  0+00:05:15
WINNT51  INTEL  Owner      Idle       0.000  1023  0+00:05:16
         INTEL  Unclaimed  Idle      -0.010  1022  0+00:09:55
To run jobs I had to use "condor_store_cred" to set my password. I did this
on both the central manager and the slave machine. (Is that correct?)
Once that was done, I could successfully run a test
program using condor_submit.
I want to use a shared filesystem, so I tried to set
up condor_credd. I did the following:
1. copied the example file
(etc/condor_config.local.credd) into condor_config.local in the condor main
directory on both the central manager and the slave machines;
2. added the following lines to the condor_config
file (on both the central manager and the slave machines):
STARTER_ALLOW_RUNAS_OWNER = True
CREDD_HOST = nes30700.lands.resnet.qg
CREDD_CACHE_LOCALLY = True
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
3. modified the condor_config file (on both the central
manager and the slave machines):
COLLECTOR_NAME = QCCCE_condor
where "QCCCE_condor" is the
name of my condor pool
4. started condor on both the central manager and the
slave machines (using net start condor)
The daemons (condor_master, condor_collector, condor_credd,
condor_negotiator, condor_schedd and condor_startd) started on both
machines. I thought condor_negotiator and condor_collector were only supposed
to run on the central manager machine, but they were running on both the
central manager and the slave machine.
5. added "run_as_owner = true" to the job