
Re: [Condor-users] Windows Condor problems with credd and executing jobs as submitting user



Greg,
Not to disagree with your recommendation, but I think I did try
condor_reconfig -all from the central manager and two or three other
Condor machines, and it did not work as you describe. I will try it
again and see what happens.
Thanks for your input into this discussion.
Alex
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg Quinn
Sent: Wednesday, December 17, 2008 10:03 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Windows Condor problems with credd and
executing jobs as submitting user

Hello all,

Alas, Alex [FEDI] wrote:
> Robert,
> 
> I had the same issue, but I fixed it by rearranging the daemons' starting
> order, placing condor_credd.exe before condor_startd.exe in the daemon
> list in condor_config.

This reliance on startup order is most certainly a bug in Condor. We 
have a fix in the works, but it won't be available until the 7.2.1 
release (7.2.0 is nearly out the door).

One way to work around the problem (aside from relying on startup order,
which is still not foolproof since there's a race condition involved) is
to issue a pool-wide reconfig once the whole pool is up and running.
"condor_reconfig -all" from the central manager should do the trick.
This will cause all execute machines to retest their communication with
the CredD, which in turn should result in the LocalCredd attribute
appearing in the machine ads, assuming everything is configured properly.
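
To check whether the reconfig had the desired effect, one can look for the
LocalCredd attribute in each machine ad. The following is a minimal Python
sketch, not part of Condor. It assumes the text it is given is the output of
"condor_status -long" (one "Attribute = value" pair per line, ads separated
by blank lines). The function name and the sample ads are made up for
illustration.

```python
def machines_missing_localcredd(ads_text):
    """Return the Name of every machine ad that has no LocalCredd attribute.

    ads_text is assumed to look like `condor_status -long` output:
    one "Attribute = value" pair per line, ads separated by blank lines.
    """
    missing = []
    # Normalise line endings so output captured on Windows parses the same way.
    text = ads_text.replace("\r\n", "\n")
    for block in text.split("\n\n"):
        ad = {}
        for line in block.splitlines():
            if "=" not in line:
                continue
            # Split on the first "=" only, so values containing "=" survive.
            name, _, value = line.partition("=")
            ad[name.strip()] = value.strip().strip('"')
        if ad and "LocalCredd" not in ad:
            missing.append(ad.get("Name", "<unnamed>"))
    return missing


# Hypothetical sample ads, for illustration only.
sample = '''Name = "exec1.test.mydomain.com"
LocalCredd = "jobcontroller:9620"

Name = "exec2.test.mydomain.com"
OpSys = "WINNT51"
'''

print(machines_missing_localcredd(sample))
```

Any machine this reports would still be rejected by a job whose Requirements
include a LocalCredd clause.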

In Condor 7.2.1, this testing of communication with the CredD will 
happen periodically instead of only on startup and reconfig.
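
The race described above can be pictured with a toy model. This is a sketch
in Python under simplifying assumptions, not Condor code: the class and
method names are invented, and the only behaviour modelled is that an
execute machine records LocalCredd when a test of its communication with the
CredD succeeds.

```python
class ToyStartd:
    """Toy stand-in for an execute machine; not real Condor code."""

    def __init__(self):
        self.local_credd = None  # None models a machine ad without LocalCredd

    def test_credd(self, credd_address):
        # credd_address is None while the CredD is not yet reachable.
        if credd_address is not None:
            self.local_credd = credd_address


credd = None                     # CredD has not started yet
startd = ToyStartd()
startd.test_credd(credd)         # test at startup: fails, attribute stays unset
after_startup = startd.local_credd

credd = "jobcontroller:9620"     # CredD comes up later
after_credd_start = startd.local_credd   # still unset: nothing re-ran the test

startd.test_credd(credd)         # what a reconfig (or a periodic retest) triggers
after_reconfig = startd.local_credd

print(after_startup, after_credd_start, after_reconfig)
```

In this model the attribute only appears once the test is run again after
the CredD is reachable, which is why startup order alone is not a reliable
fix.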

Greg

> If you want, I can send you a document with the exact lines I changed in
> the configuration file of the execute\submit machines to make it work. But
> if you want to know more about why the problem exists, and I know you
> would, please refer to the following e-mails on the condor-users mailing
> list:
> 
> 1) On 7/9/2008, Matthew Farrellee wrote:
> 
> "When the condor_startd starts it launches a condor_starter with a 
> special argument (-classad). The condor_starter prints out a class ad 
> that the condor_startd then advertises. If the condor_credd starts after
> the condor_starter -classad is run, you'll be missing the LocalCredd.
> This could be fixed by having the condor_startd periodically run 
> condor_starter -classad, if that isn't already happening."
> 
> 2) On 12/11/2008 Cooper Thompson wrote:
> 
> The first step (based on your problem description) is to get LocalCredd
> to show up in the ClassAd of each of your machines.  In my
> configuration, we do this by running "condor_store_cred -c add
> <password>" on each machine in the pool, then restarting the Condor
> services (note that condor_credd.exe must be up and running on your CM
> before condor_startd.exe comes up on your nodes, otherwise it won't
> set LocalCredd).
> 
> 3) On 12/16/2008 I wrote:
> 
> Cooper,
> 
> You pointed me in the right direction. The step of entering the
> condor_pool password and the user's password was done on all Condor
> machines as soon as I included the condor_credd configuration in the
> condor_config file of all the computers, but that wasn't enough to make
> the jobs run, because the LocalCredd attribute was missing on all the
> machines but one. So what I did to fix the problem (based on your
> suggestion) was to rearrange the starting order of the daemons, placing
> condor_credd after condor_master and before condor_startd. By doing that
> I ensured that the LocalCredd attribute shows up in the ClassAd of all
> the execute\submit machines. This fixed my issue, and jobs are being
> executed on all the execute machines using the user's credentials.
> 
> Thanks a lot for your help,
> 
> Alex
> 
>  
> 
> Alex
> 
>  
> 
> *From:* condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] *On Behalf Of *Robert Hecker
> *Sent:* Wednesday, December 17, 2008 7:51 AM
> *To:* Condor-Users Mail List
> *Subject:* Re: [Condor-users] Windows Condor problems with credd and 
> executing jobs as submitting user
> 
>  
> 
> 
> Hi Greg,
> 
> Thank you very much for the detailed information in the PDF file.
> I really did have the problem that the condor_status -f .... command
> listed my job execution machine, so I updated the pool passwords.
> 
> However, it is still not working.
> 
> If I use the condor_q -analyze command, I get the following output:
> ========================================================================
> -- Submitter: submitter.test.mydomain.com : <172.20.201.19:1078> : submitter.test.mydomain.com
>  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
> ---
> 016.000:  Run analysis summary.  Of 2 machines,
>       2 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       0 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
>         No successful match recorded.
>         Last failed match: Wed Dec 17 13:35:58 2008
>         Reason for last match failure: no match found
> 
> WARNING:  Be advised:
>    No resources matched request's constraints
>    Check the Requirements expression below:
> 
> Requirements = (Arch == "INTEL") && (OpSys == "WINNT51") &&
>     (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) &&
>     (HasFileTransfer) && (HasWindowsRunAsOwner &&
>     (LocalCredd =?= "jobcontroller:9620"))
> 
> 
> 1 jobs; 1 idle, 0 running, 0 held
> ========================================================================
> 
> 
> When I updated the pool password on the execution machine, I noticed
> that I am able to set the password:
> 
> ========================================================================
> 
> C:\condor\bin>condor_store_cred -c add
> Account: condor_pool@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> Enter password:
> 
> Operation succeeded.
> ========================================================================
> 
> 
> but if I try to query the password, it does not work:
> ========================================================================
> 
> C:\condor\bin>condor_store_cred -c query
> Account: condor_pool@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> Operation failed.
>     Make sure your HOSTALLOW_WRITE setting includes this host.
> ========================================================================
> 
> 
> My question now is: where must the HOSTALLOW_WRITE setting be made?
> 
> On the job executor, the submitter, or the controller?
> 
> I made the setting on all 3 machines as follows:
> 
> HOSTALLOW_WRITE = *.test.mydomain.com
> 
> but it is not working. (I also restarted all 3 machines!)
> 
> Any help is welcome.
> 
> Robert
> 
> 
> 
> *Greg Quinn <gquinn@xxxxxxxxxxx>*
> Sent by: condor-users-bounces@xxxxxxxxxxx
> 16.12.2008 17:07
> Please respond to: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
> Cc:
> Subject: Re: [Condor-users] Windows Condor problems with credd and executing jobs as submitting user
> 
> Hi all,
> 
> Many discussions on this list surround folks having trouble getting the
> Windows "run_as_owner" feature working by setting up a CredD. I have
> just finished rewriting the related section of our manual to give more
> of a self-contained HOWTO style introduction. I'm hoping it will make it
> easier for people to get the CredD set up in the future.
> 
> The new text will be in the soon-to-be-released 7.2.0 version of the
> manual, but in the meantime I've placed a 2-page PDF with the relevant
> section (6.2.4) here:
> 
> http://pages.cs.wisc.edu/~gquinn/run_as_owner.pdf
> 
> Please check it out if you're a user of Condor on Windows. I'm happy to
> incorporate suggestions.
> 
> Thanks!
> 
> Greg Quinn
> Condor Team
> 
>>  Hello everybody,
>>
>>  I want to use Condor to get the power of high-throughput computing,
>>  but it seems very hard to get Condor running.
>>
>>  All Condor machines are now installed and I can submit jobs, but the
>>  jobs are never executed. I think this is due to a wrong configuration,
>>  because I want to use network access and run the jobs as the
>>  submitting user.
>>
>>  I want to use Condor in a Windows domain, and I started by setting up
>>  the following machines:
>>         -1 condor controller machine
>>         -1 condor submitter machine
>>         -1 condor execution machine
>>
>>  I use Condor version 7.0.5.
>>  I want to run the jobs under a "real" user account, to get access to
>>  special network files on a file server.
>>
>>  I used the help from the site
>>  http://ben.versionzero.org/wiki/Condor_Authentication
>>  and the presentation called "quinn_windows_tutorial.ppt" to get the
>>  Condor setup working, but without success.
>>
>>  Does anyone have an idea what is going wrong here?
>>  Where can I look next to get more information and find the mistake?
>>
>>  When I installed Condor, I stored the pool password for every machine
>>  with the commands
>>
>>  condor_store_cred add -c -n executionmachine.test.mydomain.com
>>  condor_store_cred add -c -n submitmachine.test.mydomain.com
>>  condor_store_cred add -c -n controller.test.mydomain.com
>>
>>  Here I used the password "xyz", which is not a domain password.
>>
>>  After that, on the submit machine, I typed
>>
>>  "condor_store_cred add", whereupon Condor asked for a password for
>>  User@test. I typed in my password, and that was all. (This password
>>  was my domain password.)
>>
>>  After that I submitted my job.sub file, which had been tested on a
>>  default Condor installation (without running as the submitting user),
>>  where it worked.
>>
>>  job.sub:
>>  ========
>>
>>  Universe   = vanilla
>>  Executable = job.bat
>>  Arguments  = 4 12
>>  Log        = simple.log.txt
>>  Output     = simple.out.txt
>>  Error      = simple.err.txt
>>
>>  run_as_owner = true
>>
>>  Queue
>>
>>
>>
>>  But nothing happened. That is, when I check the status with condor_q,
>>  I see the job in the queue, but it stays idle.
>>
>>  Did I get some configuration wrong?
>>  Or did I set up some passwords wrong?
>>
>>  It would be great if someone had an idea of what I have to do to get
>>  Condor running.
>>
>>  Thank you very much for your help.
>>  Any advice would be helpful.
>>
>>  Robert
>>   
>>
>>
>>
>>
>>  Here are my configurations:
>>
>>  The condor_config file of the controller has the following changes
>>  from the original:
>>  ========================================================
>>
>>  LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
>>                      $(LOCAL_DIR)/condor_config.local.credd
>>
>>  HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com
>>
>>  And the condor_config.local.credd of the Controller looks like this:
>>  ================================================
>>  ######################################################################
>>  ##
>>  ##  condor_config.credd
>>  ##
>>  ##  This is the default local configuration file for the machine
>>  ##  running the condor_credd.  You should copy this file to the
>>  ##  appropriate location and customize it for your needs.
>>  ##
>>  ######################################################################
>>
>>  ## Note: The following settings will need to be present in your
>>  ## global config file:
>>  ##
>>  ##   CREDD_HOST = my-credd.cs.wisc.edu
>>  ##   STARTER_ALLOW_RUNAS_OWNER = True
>>  ##   CREDD_CACHE_LOCALLY = True
>>  ##
>>  ## You'll also need to ensure that clients are configured to use
>>  ## PASSWORD authentication on any machine that can run jobs as the
>>  ## submitting user. For example,
>>  ##
>>  ##   SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
>>
>>  ## CREDD_SETTINGS
>>
>>  ## CREDD logging settings
>>  ## Customize these if you wish.
>>  CREDD_LOG = $(LOG)/CreddLog
>>  CREDD_DEBUG = D_COMMAND
>>  MAX_CREDD_LOG = 50000000
>>
>>  #################################################
>>  ## CREDD Expert settings
>>  ## Everything below is for the UBER-KNOWLEDGEABLE only!
>>  ## Do not change these unless you know what you are doing!
>>  #################################################
>>
>>
>>  DAEMON_LIST = $(DAEMON_LIST), CREDD
>>  #DC_DAEMON_LIST = \
>>  #MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \
>>  #VIEW_SERVER, CONDOR_VIEW, VIEW_COLLECTOR, HAWKEYE, CREDD, HAD, \
>>  #QUILL
>>
>>  CREDD    = $(SBIN)/condor_credd.exe
>>
>>  # Timeout session quickly since we normally only get contacted
>>  # once per starter
>>  SEC_CREDD_SESSION_TIMEOUT = 10
>>
>>
>>  # Set security settings so that full security to the credd is required
>>  CREDD.SEC_DEFAULT_AUTHENTICATION = REQUIRED
>>  CREDD.SEC_DEFAULT_ENCRYPTION = REQUIRED
>>  CREDD.SEC_DEFAULT_INTEGRITY = REQUIRED
>>  CREDD.SEC_DEFAULT_NEGOTIATION = REQUIRED
>>
>>  # Require PASSWORD auth for password fetching
>>  CREDD.SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
>>
>>  # Only honor password fetch requests to the trusted "condor_pool" user
>>  CREDD.ALLOW_DAEMON = condor_pool@$(UID_DOMAIN)
>>
>>  # Require NTSSPI for storing credentials
>>  CREDD.SEC_DEFAULT_AUTHENTICATION_METHODS = NTSSPI
>>
>>  The submit machine has the following condor_config:
>>  ====================================
>>  LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
>>                      $(LOCAL_DIR)/condor_config.local.submit.execute
>>
>>  HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com
>>
>>  CREDD_HOST  = $(CONDOR_HOST):$(CREDD_PORT)
>>
>>  The condor_config.local.submit.execute file from the submit machine
>>  looks like:
>>  =============================================================
>>
>>  ######################################################################
>>  ##
>>  ##  condor_config.local.submit.execute
>>  ##
>>  ##  This is the default local configuration file for the submit machine
>>  ##  and execute machine.
>>  ##
>>  ######################################################################
>>
>>  ## Note: The following settings will need to be present in your
>>  ## global config file:
>>  STARTER_ALLOW_RUNAS_OWNER = True
>>  CREDD_CACHE_LOCALLY = True
>>  ##
>>  ## You'll also need to ensure that clients are configured to use
>>  ## PASSWORD authentication on any machine that can run jobs as the
>>  ## submitting user. For example,
>>  ##
>>  SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
>>
>>  And the condor_config File from the Execution machine looks like:
>>  =================================================
>>
>>  LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local \
>>                      $(LOCAL_DIR)/condor_config.local.submit.execute
>>
>>  HOSTALLOW_CONFIG = Submitmachine.test.mydomain.com
>>
>>  CREDD_HOST  = $(CONDOR_HOST):$(CREDD_PORT)
>>
>>  And the condor_config.local.submit.execute file on the execution
>>  machine is identical to the one on the submit machine.
> 
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/