
[Condor-users] condor_store_credd -c add prevents jobs from running



Hi,

I had the same problem as Jeffrey in the same scenario (see the mails below).

I tried to fix it using

condor_store_cred -c add

 

After that, when submitting a job from the slave machine, the following problem occurred:

No credential stored for Tom@SLAVE

 

But

condor_store_cred add

complains:

make sure your HOSTALLOW_WRITE setting includes this host.

 

This surprises me, since HOSTALLOW_WRITE and HOSTALLOW_CONFIG are set to * on all machines.

Does anyone have a hint?

 

Best regards,

Tom Paschenda

 


 

When your job tries to start, it probably uses a shared pool password to authenticate against the credd. Did you set the shared pool password on all machines?

 

condor_store_cred -c add

 

 

Mike

 


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jeffrey Stephen
Sent: 08 March 2007 06:40
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] jobs don't run when using condor_credd

Hi,

 

I am trying to set up condor_credd on Windows XP. I have a central manager machine (nes30700) and one submit/execute (i.e. slave) machine (nes15300). The slave machine is configured to always run jobs:

 

=================================================================

> condor_status

 

Name          OpSys       Arch   State      Activity   LoadAv Mem   ActvtyTime

vm1@NES30700. WINNT51     INTEL  Owner      Idle       0.040  1023  0+00:05:15
vm2@NES30700. WINNT51     INTEL  Owner      Idle       0.000  1023  0+00:05:16
nes15300.land WINNT51     INTEL  Unclaimed  Idle       -0.010 1022  0+00:09:55

=================================================================

 

To run jobs I had to use "condor_store_cred" to set my password. I did this on both the central manager and the slave machine. (Is that correct?)

Once that was done, I could successfully run a test program using condor_submit.

 

I want to use a shared filesystem, so I tried to set up condor_credd. I did the following:

1. copied the example file (etc/condor_config.local.credd) into condor_config.local in the condor main directory on both the central manager and the slave machines;

2. added the following lines to the condor_config file (on both the central manager and the slave machines):

    STARTER_ALLOW_RUNAS_OWNER = True
    CREDD_HOST = nes30700.lands.resnet.qg
    CREDD_CACHE_LOCALLY = True
    SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD

3. modified the condor_config file (on both the central manager and the slave machines):

   COLLECTOR_NAME = QCCCE_condor

   where "QCCCE_condor" is the name of my condor pool

4. started condor on both the central manager and the slave machines (using net start condor)

The daemons (condor_master, condor_collector, condor_credd, condor_negotiator, condor_schedd and condor_startd) started on both machines. I thought condor_negotiator and condor_collector were only supposed to run on the central manager machine, but they were running on both the central manager and the slave machine.

5. added "run_as_owner = true" to the job config file
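
For illustration, a minimal submit description file containing this setting might look like the sketch below. Only the run_as_owner line comes from the steps above; the universe, executable and file names are hypothetical placeholders and would need to match the actual test job.

```
# Sketch of a submit description file (file names are assumed, not from the original job)
universe     = vanilla
executable   = test.exe
output       = test.out
error        = test.err
log          = test.log

# Ask Condor to run the job under the submitting user's account;
# this relies on STARTER_ALLOW_RUNAS_OWNER = True on the execute machine
run_as_owner = true

queue
```

A file like this would be passed to condor_submit in the usual way.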