
Re: [Condor-users] Windows Condor problems with credd and executing jobs as submitting user



Hello all,

Alas, Alex [FEDI] wrote:
Robert,

I had the same issue, but I fixed it by rearranging the daemons’ starting order, placing condor_credd.exe before condor_startd.exe in the daemon list in condor_config.

This reliance on startup order is most certainly a bug in Condor. We have a fix in the works, but it won't be available until the 7.2.1 release (7.2.0 is nearly out the door).

One way to work around the problem (aside from relying on startup order, which is still not foolproof since there's a race condition involved) is to issue a pool-wide reconfig once the whole pool is up and running. "condor_reconfig -all" from the central manager should do the trick. This will cause all execute machines to retest their communication with the CredD, which in turn should result in the LocalCredd attribute appearing in the machine ads, assuming everything is configured properly.

In Condor 7.2.1, this testing of communication with the CredD will happen periodically instead of only on startup and reconfig.
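
To see which execute machines are still missing the attribute after a reconfig, the machine ads can be inspected. The following is a minimal Python sketch, not part of Condor. It assumes the "Attr = value" layout, with a blank line between ads, that "condor_status -long" prints. The helper names (parse_ads, missing_local_credd, query_pool) are made up for illustration.

```python
import subprocess


def parse_ads(text):
    """Split `condor_status -long` style output into one dict per machine ad."""
    ads, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current ad.
            if current:
                ads.append(current)
                current = {}
            continue
        name, sep, value = line.partition("=")
        if sep:
            current[name.strip()] = value.strip()
    if current:
        ads.append(current)
    return ads


def missing_local_credd(ads):
    """Return the names of machine ads that do not advertise LocalCredd."""
    return [ad.get("Name", "<unnamed>").strip('"')
            for ad in ads if "LocalCredd" not in ad]


def query_pool():
    """Run condor_status (assumed to be on PATH) and report the machines to look at."""
    result = subprocess.run(["condor_status", "-long"],
                            capture_output=True, text=True, check=True)
    return missing_local_credd(parse_ads(result.stdout))
```

Calling query_pool() on the central manager after "condor_reconfig -all" should return an empty list once every execute machine has picked up the CredD.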

Greg

If you want, I can send you a document with the exact lines I changed in the configuration files of the execute/submit machines to make it work. If you want to know more about why the problem exists (and I know you would), please refer to the following e-mails from the Condor mailing list:

1) On 7/9/2008, Matthew Farrellee wrote:

“When the condor_startd starts it launches a condor_starter with a special argument (-classad). The condor_starter prints out a class ad that the condor_startd then advertises. If the condor_credd starts after the condor_starter -classad is run, you'll be missing the LocalCredd. This could be fixed by having the condor_startd periodically run condor_starter -classad, if that isn't already happening.”

2) On 12/11/2008 Cooper Thompson wrote:

The first step (based on your problem description) is to get localcredd to show up in the classad of each of your machines. In my configuration, we do this by running "condor_store_cred -c add <password>" on each machine in the pool, then restarting the Condor services (note that condor_credd.exe must be up and running on your CM before condor_startd.exe comes up on your nodes, otherwise it won't set localcredd).

3) On 12/16/2008 I wrote:

Cooper,

You pointed me in the right direction. I had entered the condor_pool password and the user's password on all Condor machines as soon as I included the condor_credd configuration in the condor_config file of every computer, but that wasn't enough to make the jobs run, because the LocalCredd attribute was missing on all machines but one. What I did to fix the problem (based on your suggestion) was to rearrange the starting order of the daemons, placing condor_credd after condor_master and before condor_startd. That ensured that the LocalCredd attribute is advertised in the ClassAds of all the execute/submit machines. This fixed my issue, and jobs are now being executed on all the execute machines using the user's credentials.
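
The reordering described above can be sketched as a small Python helper that rewrites a DAEMON_LIST value so that CREDD sits directly before STARTD. This is only an illustration: the function name is hypothetical, it assumes MASTER is already listed first, and, as Greg notes, startup order alone is not a foolproof fix because of the race condition.

```python
def place_credd_before_startd(daemon_list):
    """Return a DAEMON_LIST value with CREDD placed immediately before STARTD.

    If STARTD is not in the list, CREDD is appended at the end.
    """
    daemons = [d.strip() for d in daemon_list.split(",") if d.strip()]
    # Drop any existing CREDD entry so it is not listed twice.
    daemons = [d for d in daemons if d.upper() != "CREDD"]
    upper = [d.upper() for d in daemons]
    index = upper.index("STARTD") if "STARTD" in upper else len(daemons)
    daemons.insert(index, "CREDD")
    return ", ".join(daemons)


if __name__ == "__main__":
    print("DAEMON_LIST = " + place_credd_before_startd("MASTER, STARTD, SCHEDD, CREDD"))
```

For the input shown, the resulting line is "DAEMON_LIST = MASTER, CREDD, STARTD, SCHEDD".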

Thanks a lot for your help,

Alex

Alex

*From:* condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] *On Behalf Of *Robert Hecker
*Sent:* Wednesday, December 17, 2008 7:51 AM
*To:* Condor-Users Mail List
*Subject:* Re: [Condor-users] Windows Condor problems with credd and executing jobs as submitting user


Hi Greg,

Thank you very much for the detailed information in the PDF file.
I really did have the problem that the condor_status -f .... command
listed my job execution machine, and so I updated the pool passwords.

Anyway, it is still not working.

If I use the condor_q -analyze command, I get the following output:
==========================================================================
-- Submitter: submitter.test.mydomain.com : <172.20.201.19:1078> : submitter.test.mydomain.com
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
016.000:  Run analysis summary.  Of 2 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
        No successful match recorded.
        Last failed match: Wed Dec 17 13:35:58 2008
        Reason for last match failure: no match found

WARNING:  Be advised:
   No resources matched request's constraints
   Check the Requirements expression below:

Requirements = (Arch == "INTEL") && (OpSys == "WINNT51") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (HasFileTransfer) && (HasWindowsRunAsOwner
&& (LocalCredd =?= "jobcontroller:9620"))


1 jobs; 1 idle, 0 running, 0 held
============================================================================
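
Since both machines are rejected by the job's Requirements, it can help to test the clauses one at a time against a machine ad. The Python sketch below is an assumption-laden simplification, not a ClassAd evaluator: the machine ad is modelled as a plain dict, the Disk and Memory clauses are left out, and the function name is hypothetical.

```python
def failing_clauses(machine_ad, credd_host):
    """Return the Requirements clauses that a machine ad (a dict) does not satisfy."""

    def same_ci(attr, expected):
        # ClassAd "==" on strings ignores case; a missing attribute never matches.
        value = machine_ad.get(attr)
        return isinstance(value, str) and value.upper() == expected.upper()

    checks = [
        ('Arch == "INTEL"', same_ci("Arch", "INTEL")),
        ('OpSys == "WINNT51"', same_ci("OpSys", "WINNT51")),
        ("HasFileTransfer", machine_ad.get("HasFileTransfer") is True),
        ("HasWindowsRunAsOwner", machine_ad.get("HasWindowsRunAsOwner") is True),
        # "=?=" is an exact comparison that is false (not undefined)
        # when the attribute is missing from the ad.
        ('LocalCredd =?= "%s"' % credd_host,
         machine_ad.get("LocalCredd") == credd_host),
    ]
    return [clause for clause, ok in checks if not ok]


if __name__ == "__main__":
    ad = {"Arch": "INTEL", "OpSys": "WINNT51",
          "HasFileTransfer": True, "HasWindowsRunAsOwner": True}
    print(failing_clauses(ad, "jobcontroller:9620"))
```

For an ad like the one in the example, which has everything except LocalCredd, only the last clause is reported. That is the situation described earlier in this thread.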

When I updated the pool password on the execution machine, I noticed that
I am able to set the password:

=============================================================================
C:\condor\bin>condor_store_cred -c add
Account: condor_pool@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Enter password:

Operation succeeded.
==============================================================================

But if I try to query the password, it does not work:
==============================================================================
C:\condor\bin>condor_store_cred -c query
Account: condor_pool@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Operation failed.
    Make sure your HOSTALLOW_WRITE setting includes this host.
==============================================================================

My question now is: where must the HOSTALLOW_WRITE setting be made?

On the job executor, the submitter, or the controller?

I made the setting on all 3 machines as follows:

HOSTALLOW_WRITE = *.test.mydomain.com

but it is not working. (I also restarted all 3 machines!)
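
One thing that can be checked offline is whether a given host name is actually covered by the pattern. The Python sketch below is a simplified model and not Condor's own matching code: it assumes a comma-separated list of host name patterns in which only "*" is a wildcard, compares case-insensitively, and ignores IP addresses and subnets. The function name is hypothetical.

```python
import re


def host_allowed(hostname, allow_list):
    """Return True if hostname matches any pattern in a HOSTALLOW-style list."""
    for pattern in (p.strip() for p in allow_list.split(",")):
        if not pattern:
            continue
        # Treat "*" as "any run of characters" and everything else literally.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, hostname, flags=re.IGNORECASE):
            return True
    return False


if __name__ == "__main__":
    for host in ("submitter.test.mydomain.com", "jobcontroller"):
        print(host, host_allowed(host, "*.test.mydomain.com"))
```

In this model the fully qualified name matches while a short name such as "jobcontroller" does not, so it may be worth checking under which name each machine is seen by the others. Whether that is the cause of the failure here is not established in this thread.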

Any help is welcome.

Robert



*Greg Quinn <gquinn@xxxxxxxxxxx>*
Sent by: condor-users-bounces@xxxxxxxxxxx
16.12.2008 17:07
Please reply to: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Cc:
Subject: Re: [Condor-users] Windows Condor problems with credd and executing jobs as submitting user




Hi all,

Many discussions on this list surround folks having trouble getting the
Windows "run_as_owner" feature working by setting up a CredD. I have
just finished rewriting the related section of our manual to give more
of a self-contained HOWTO style introduction. I'm hoping it will make it
easier for people to get the CredD set up in the future.

The new text will be in the soon-to-be-released 7.2.0 version of the
manual, but in the meantime I've placed a 2-page PDF with the relevant
section (6.2.4) here:

http://pages.cs.wisc.edu/~gquinn/run_as_owner.pdf

Please, check it out if you're a user of Condor on Windows. I'm happy to
incorporate suggestions.

Thanks!

Greg Quinn
Condor Team

 Hello everybody,

 I want to use Condor to get the power of high-throughput computing.
 But it seems very hard to get Condor running.

All Condor machines are installed and I can submit jobs, but the jobs
are never executed. I think this is due to a wrong configuration,
because I want to use network access and run the jobs as the
submitting user.

I want to use Condor in a Windows domain, and I started by setting up the
following machines:
        - 1 Condor controller machine
        - 1 Condor submit machine
        - 1 Condor execution machine

 I use Condor version 7.0.5.
I want to run the jobs under a "real" user account, to get access
to special network files on a file server.

I used the help from the site
http://ben.versionzero.org/wiki/Condor_Authentication
and the presentation called "quinn_windows_tutorial.ppt" to get the
Condor setup working, but without success.

 Does anyone have an idea what is going wrong here?
 Where can I look next to get more information and find the mistake?

When I installed Condor, I stored the pool password on every machine
with the commands

 condor_store_cred add -c -n executionmachine.test.mydomain.com
 condor_store_cred add -c -n submitmachine.test.mydomain.com
 condor_store_cred add -c -n controller.test.mydomain.com

 I used the password "xyz" here, which is not a domain password.

 After that, on the submit machine, I typed

 "condor_store_cred add", whereupon Condor asked for a password for User@test.
I typed in my password, and that was all. (This password was my domain
password.)

After that I submitted my job.sub file, which had been tested on a default
Condor installation (without executing as the submitting user), where it
worked.

 job.sub:
 ========

 Universe   = vanilla
 Executable = job.bat
 Arguments  = 4 12
 Log        = simple.log.txt
 Output     = simple.out.txt
 Error      = simple.err.txt

 run_as_owner = true

 Queue



 But nothing happened. When I check the status with condor_q,
 I see the job in the queue, but it stays idle.

 Did I get some configuration wrong?
 Or did I set up some passwords incorrectly?

It would be great if someone had an idea what I have to do to get
Condor running.

 Thank you very much for your help.
 Any advice would be helpful.

 Robert



 Here are my configurations:

The condor_config file of the controller has the following changes from the
original:
 ========================================================

 LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
                     $(LOCAL_DIR)/condor_config.local.credd

 HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com

 And the condor_config.local.credd of the Controller looks like this:
 ================================================
 ######################################################################
 ##
 ##  condor_config.credd
 ##
 ##  This is the default local configuration file for the machine
 ##  running the condor_credd.  You should copy this file to the
 ##  appropriate location and customize it for your needs.
 ##
 ######################################################################

 ## Note: The following settings will need to be present in your
 ## global config file:
 ##
 ##   CREDD_HOST = my-credd.cs.wisc.edu
 ##   STARTER_ALLOW_RUNAS_OWNER = True
 ##   CREDD_CACHE_LOCALLY = True
 ##
 ## You'll also need to ensure that clients are configured to use
 ## PASSWORD authentication on any machine that can run jobs as the
 ## submitting user. For example,
 ##
 ##   SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD

 ## CREDD_SETTINGS

 ## CREDD logging settings
 ## Customize these if you wish.
 CREDD_LOG = $(LOG)/CreddLog
 CREDD_DEBUG = D_COMMAND
 MAX_CREDD_LOG = 50000000

 #################################################
 ## CREDD Expert settings
 ## Everything below is for the UBER-KNOWLEDGEABLE only!
 ## Do not change these unless you know what you are doing!
 #################################################


 DAEMON_LIST = $(DAEMON_LIST), CREDD
 #DC_DAEMON_LIST = \
 #MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \
 #VIEW_SERVER, CONDOR_VIEW, VIEW_COLLECTOR, HAWKEYE, CREDD, HAD, \
 #QUILL

 CREDD    = $(SBIN)/condor_credd.exe

 # Timeout session quickly since we normally only get contacted
 # once per starter
 SEC_CREDD_SESSION_TIMEOUT = 10


 # Set security settings so that full security to the credd is required
 CREDD.SEC_DEFAULT_AUTHENTICATION = REQUIRED
 CREDD.SEC_DEFAULT_ENCRYPTION = REQUIRED
 CREDD.SEC_DEFAULT_INTEGRITY = REQUIRED
 CREDD.SEC_DEFAULT_NEGOTIATION = REQUIRED

 # Require PASSWORD auth for password fetching
 CREDD.SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD

 # Only honor password fetch requests to the trusted "condor_pool" user
 CREDD.ALLOW_DAEMON = condor_pool@$(UID_DOMAIN)

 # Require NTSSPI for storing credentials
 CREDD.SEC_DEFAULT_AUTHENTICATION_METHODS = NTSSPI

 The submit machine has the following condor_config:
 ====================================
 LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
                     $(LOCAL_DIR)/condor_config.local.submit.execute

 HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com

 CREDD_HOST  = $(CONDOR_HOST):$(CREDD_PORT)

The condor_config.local.submit.execute file of the submit
machine looks like this:
 =============================================================

 ######################################################################
 ##
 ##  condor_config.local.submit.execute
 ##
 ##  This is the default local configuration file for the submit machine
 ##  and execute machine.
 ##
 ######################################################################

 ## Note: The following settings will need to be present in your
 ## global config file:
 STARTER_ALLOW_RUNAS_OWNER = True
 CREDD_CACHE_LOCALLY = True
 ##
 ## You'll also need to ensure that clients are configured to use
 ## PASSWORD authentication on any machine that can run jobs as the
 ## submitting user. For example,
 ##
 SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD

 And the condor_config file of the execution machine looks like this:
 =================================================

 LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
                     $(LOCAL_DIR)/condor_config.local.submit.execute

 HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com

 CREDD_HOST  = $(CONDOR_HOST):$(CREDD_PORT)

 And the condor_config.local.submit.execute file of the
 execution machine is the same as the one on the submit machine.

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/

