
[HTCondor-users] Fwd: Re: Cannot send jobs as Owner in Windows OS





From: rb
Date: 19 September 2018 at 11:02
To: "Todd Tannenbaum"
Subject: Re: [HTCondor-users] Cannot send jobs as Owner in Windows OS


Hello Todd,

thanks for the additional hints.
I was able to move a bit forward, but I have not been successful yet.
For example, I was able to specify a condor pool password. Jobs are now picked up by Condor; however, none of them are picked up by the nodes, as it seems the requirements are not matching.
(Remark: jobs match and run when using the default temp user from Condor.)


I am attaching the Condor config files I have created: one for the master, one for a submitter, and one for a node.
The submit files contain the line "Run_as_owner = true".

a) Basically, I copied the contents of ..\etc\condor_config.local.credd into the condor config file of the pool manager running the CREDD.
b) I copied
CREDD_HOST = credd.cs.wisc.edu
CREDD_CACHE_LOCALLY = True

STARTER_ALLOW_RUNAS_OWNER = True

ALLOW_CONFIG = Administrator@*
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED
into the config files of all execute and submit machines.
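
(To double-check that these settings are actually in effect on each machine, the hint in the stock config header applies; for example:

condor_config_val -v CREDD_HOST STARTER_ALLOW_RUNAS_OWNER SEC_CLIENT_AUTHENTICATION_METHODS

prints each value together with the config file it was read from.)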
 

When I now run jobs, they get stuck in the queue.
Running condor_q -analyze gives the following message:
 
WARNING:  Be advised:
   No resources matched request's constraints
The Requirements _expression_ for your job is:
    ( ( ( OpSys == "WINNT51" || OpSys == "WINNT52" || OpSys == "WINNT60" ||
          OpSys == "WINNT61" ) || ( ( OpSys == "WINDOWS" ||
            OpSys == "LINUX" ) && Arch == "X86_64" ) ) ) &&
    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) &&
    ( TARGET.HasFileTransfer ) && ( TARGET.HasWindowsRunAsOwner &&
      ( TARGET.LocalCredd is "AHERSRVBLD28.lgs-net.com" ) )

Suggestions:
    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( ( ( OpSys == "WINNT51" || OpSys == "WINNT52" || OpSys == "WINNT60" || OpSys == "WINNT61" ) || ( ( OpSys == "WINDOWS" || OpSys == "LINUX" ) && Arch == "X86_64" ) ) )
                                      0                   REMOVE
2   ( TARGET.HasWindowsRunAsOwner && ( TARGET.LocalCredd is "AHERSRVBLD28.lgs-net.com" ) )
                                      0                   REMOVE
3   ( TARGET.Disk >= 3 )              18
4   ( TARGET.Memory >= ifthenelse(MemoryUsage isnt undefined,MemoryUsage,0) )
                                      18
5   ( TARGET.HasFileTransfer )        18
---
7163.000:  Request is running.
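
(One way to compare this against what the nodes actually advertise, assuming the autoformat option of condor_status is available in 8.4, would be:

condor_status -af Name OpSys Arch HasWindowsRunAsOwner LocalCredd

Since rows 1 and 2 match 0 machines, this should show whether HasWindowsRunAsOwner and LocalCredd are missing from the machine ads or simply have different values.)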






Some questions:

- Could this depend on the version of Condor? I am running 8.4.10 on all machines.

- My user is known in the domain. Would I need to add this user to the local users of each execute machine?

- In section 7.2.5 "Condor_credd Daemon" of the user manual, a variable called "Local_credd" is mentioned. However, I cannot find this variable in any of the examples. Is it necessary to specify this variable in the config file?

- Do I need to use a pool password? Or is it enough to follow the suggestion from section 7.2.6 "Executing Jobs with the User's Profile Loaded" and just set "load_profile = True" in the submit file?

- In section 3.8.13.2 of the user manual I find the following sentence: "Under Windows, HTCondor by default runs jobs under a dynamically created local account that exists for the duration of the job, but it can optionally run the job as the user account that owns the job if STARTER_ALLOW_RUNAS_OWNER is True and the job contains RunAsOwner=True."
Is it RunAsOwner = true or Run_As_Owner = true?
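
(For reference: the submit example later in this thread uses the underscore form. My understanding, worth confirming against the manual, is that submit-file commands are case-insensitive and that the manual quote refers to the resulting job ClassAd attribute:

# in the submit description file (submit commands are case-insensitive):
run_as_owner = true
# this should produce the job ClassAd attribute RunAsOwner = true,
# which can be checked with: condor_q -long <cluster.proc> | findstr RunAsOwner
)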

 
By the way:
whoami gives: calibration@xxxxxxxxxxx.
This is correct; I would like to have this user run jobs in the Condor environment.


Best regards,
Robert





> Sent: Thursday, 13 September 2018 at 22:31
> From: "Todd Tannenbaum" <tannenba@xxxxxxxxxxx>
> To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>, rb <robertbosch@xxxxxx>
> Subject: Re: [HTCondor-users] Cannot send jobs as Owner in Windows OS
>
> On 9/12/2018 5:02 AM, rb wrote:
> > I would like to submit and process the job as its "owner".
> > That is, the job should not be processed by the default "condor-slot user", but by the person who is logged in on the submitter and submits the job.
> >
> > For this we created a user "calibration". This user is registered in our domain and has admin permissions on all machines (all Windows 10) connected to the pool.
> >
> > For this I edited the config file on the submit and execute nodes:
> >
> > [...]
> > FILESYSTEM_DOMAIN = lgs-net.com
> > UID_DOMAIN = lgs-net.com
> > TRUST_UID_DOMAIN = true
> > SOFT_UID_DOMAIN = true
> > STARTER_ALLOW_RUNAS_OWNER = true
> > [...]
> >
> >
> > In addition, the submit files contain the following entry:
> > [...]
> > Run_As_Owner = true
> > [...]
> >
> >
> > I also used "condor_store_cred add" on submitter and pool to store PW for user "calibration"
> >
> > Still it's not working!
> > Jobs are created, as are .err and .out files, but they are not picked up by the scheduler. Using "condor_q": no jobs in the queue.
> >
> >
> > Can someone give some hints?
> >
>
> Did you do a condor_reconfig or restart HTCondor after changing the config settings on your execute and submit hosts?
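>
> (For example, on each host after editing the config, either of:
>
> condor_reconfig    # have the running daemons re-read the configuration
> condor_restart     # or restart the daemons entirely
> )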
>
> Also I don't see anything in your config re your CREDD_HOST etc, as described in the Microsoft Windows chapter in the HTCondor Manual for executing jobs as the Submitting User... specifically I am looking at this section:
> http://htcondor.org/manual/v8.7/MicrosoftWindows.html#x75-5750008.2.4
> Perhaps you want to re-read and follow the configuration examples in that part of the Manual.
>
> Some additional ideas / suggestions:
>
> Are you running condor_submit as user "calibration" ? What does "whoami" report before submitting the job?
>
> Try submitting a very simple job and see if that runs as user "calibration". I would suggest running "whoami.exe" with a job event log and seeing what happens. For example --
> executable = whoami.exe
> output = test.out
> error = test.err
> log = test.log
> run_as_owner = true
> queue
>
> and then take a look at test.out, test.err, test.log.
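>
> (One caveat with the example above: condor_submit looks for whoami.exe in the submit directory. If it is not there, a variant that runs the system copy without transferring it, assuming a standard Windows install, would be:
>
> executable = C:\Windows\System32\whoami.exe
> transfer_executable = false
> output = test.out
> error = test.err
> log = test.log
> run_as_owner = true
> queue
> )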
>
> You say the job is successfully submitted, but condor_q says there are no jobs in the queue? What does "condor_q -allusers" say? Or is it because the job completes quickly... what does condor_history say?
>
> Re the below observations: I am not the Windows expert, but I believe you should only need to run 'condor_store_cred add' on the submit node, which will then send the password (encrypted) and securely store it on the host running the condor_credd daemons. The execute node will securely fetch the password as needed.
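>
> (A sketch of the two commands involved; double-check against the Windows chapter of the manual:
>
> condor_store_cred add       # run as the job owner on the submit node; stores
>                             # that user's password with the condor_credd
> condor_store_cred -c add    # stores the pool password ("condor" credential)
>                             # used for daemon-to-daemon PASSWORD authentication
> )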
>
> Hope the above helps,
> Todd
>
>
> > I made two observations:
> > 1) I cannot use "condor_store_cred add" on executing machines. It returns an error "operation failed". Make sure you have WRITE permission onto this node. Although "WRITE = *" is set in all config files.
> > 2) By default our Software adds "load_profile = true" in all submission files. Could this be a potential problem?
> >
> >
> >
> > Best regards,
> > Robert
> >
> >
> >
> >
> >
> >
> >
>
>
> --
> Todd Tannenbaum <tannenba@xxxxxxxxxxx>
> HTCondor Technical Lead, Center for High Throughput Computing
> Department of Computer Sciences, University of Wisconsin-Madison
> 1210 W. Dayton St. Rm #4257, Madison, WI 53706-1685
> Phone: (608) 263-7132
>
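
## [Attachment 1 of 3: condor_config for an execute node (DAEMON_LIST = MASTER STARTD)]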
######################################################################
##
##  condor_config
##
##  This is the global configuration file for condor. This is where
##  you define where the local config file is. Any settings
##  made here may potentially be overridden in the local configuration
##  file.  KEEP THAT IN MIND!  To double-check that a variable is
##  getting set from the configuration file that you expect, use
##  condor_config_val -v <variable name>
##
##  condor_config.annotated is a more detailed sample config file
##
##  Unless otherwise specified, settings that are commented out show
##  the defaults that are used if you don't define a value.  Settings
##  that are defined here MUST BE DEFINED since they have no default
##  value.
##
######################################################################

##  Where have you installed the bin, sbin and lib condor directories?   
RELEASE_DIR = C:\condor

##  Where is the local condor directory for each host?  This is where the local config file(s), logs and
##  spool/execute directories are located. This is the default for Linux and Unix systems:
#LOCAL_DIR = $(TILDE)
##  This is the default on Windows systems:
#LOCAL_DIR = $(RELEASE_DIR)

##  Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = $(LOCAL_DIR)\condor_config.local
##  If your configuration is on a shared file system, then this might be a better default
#LOCAL_CONFIG_FILE = $(RELEASE_DIR)\etc\$(HOSTNAME).local
##  If the local config file is not present, is it an error? (WARNING: This is a potential security issue.)
REQUIRE_LOCAL_CONFIG_FILE = FALSE

##  The normal way to do configuration with RPMs is to read all of the
##  files in a given directory that don't match a regex as configuration files.
##  Config files are read in lexicographic order.
LOCAL_CONFIG_DIR = $(LOCAL_DIR)\config
#LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$

##  Use a host-based security policy. By default CONDOR_HOST and the local machine will be allowed
use SECURITY : HOST_BASED
##  To expand your condor pool beyond a single host, set ALLOW_WRITE to match all of the hosts
#ALLOW_WRITE = *.cs.wisc.edu
##  FLOCK_FROM defines the machines that grant access to your pool via flocking. (i.e. these machines can join your pool).
#FLOCK_FROM =
##  FLOCK_TO defines the central managers that your schedd will advertise itself to (i.e. these pools will give matches to your schedd).
#FLOCK_TO = condor.cs.wisc.edu, cm.example.edu

##--------------------------------------------------------------------
## Values set by the condor_configure script:
##--------------------------------------------------------------------

CONDOR_HOST = 194.11.95.125
UID_DOMAIN = lgs-net.com
CONDOR_ADMIN = Calibration@xxxxxxxxxxx

SMTP_SERVER = 

ALLOW_READ = *

ALLOW_WRITE = *

ALLOW_ADMINISTRATOR = *

JAVA = C:\PROGRA~1\Java\JRE18~2.0_1\bin\java.exe

use POLICY : ALWAYS_RUN_JOBS
WANT_VACATE = FALSE

WANT_SUSPEND = TRUE

DAEMON_LIST = MASTER STARTD

NUM_SLOTS = $(detected_Memory)/16000
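## (presumably one slot per 16000 MB of detected memory; the computed value
##  can be verified with: condor_config_val NUM_SLOTS)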



FILESYSTEM_DOMAIN = lgs-net.com
TRUST_UID_DOMAIN = true 
SOFT_UID_DOMAIN = true

STARTER_ALLOW_RUNAS_OWNER = true
CREDD_HOST = AHERSRVBLD28.lgs-net.com 
CREDD_CACHE_LOCALLY = True   
 
ALLOW_CONFIG = *  
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD  
SEC_CONFIG_NEGOTIATION = REQUIRED  
SEC_CONFIG_AUTHENTICATION = REQUIRED  
SEC_CONFIG_ENCRYPTION = REQUIRED  
SEC_CONFIG_INTEGRITY = REQUIRED 
######################################################################
##
##  condor_config  [Attachment 2 of 3: the pool manager / CREDD host
##  (DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR CREDD)]
##
##  (Stock template header and the path/security defaults are identical
##  to the first attached file above.)
##
######################################################################

##--------------------------------------------------------------------
## Values set by the condor_configure script:
##--------------------------------------------------------------------

CONDOR_HOST = 194.11.95.125
COLLECTOR_NAME = HxMap_IT
UID_DOMAIN = lgs-net.com
CONDOR_ADMIN = Calibration@xxxxxxxxxxx
SMTP_SERVER = 
ALLOW_READ = *
ALLOW_WRITE = *
ALLOW_ADMINISTRATOR = *
START = FALSE
WANT_VACATE = FALSE
WANT_SUSPEND = TRUE
DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR CREDD
NUM_SLOTS_Type1 = 1

FILESYSTEM_DOMAIN = lgs-net.com
TRUST_UID_DOMAIN = true 
SOFT_UID_DOMAIN = true

STARTER_ALLOW_RUNAS_OWNER = true
CREDD_HOST = 194.11.95.125 
CREDD_CACHE_LOCALLY = True 



SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD

##
## And finally, you'll need to enable CONFIG-level access for all
## machines in the pool so that the pool password can be stored:
##

ALLOW_CONFIG = *
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED
##
## See the "Executing Jobs as the Submitting User" section of the
## Condor manual for further details.

## CREDD_SETTINGS

## CREDD logging settings
## Customize these if you wish.
CREDD_LOG = $(LOG)/CreddLog
CREDD_DEBUG = D_COMMAND
MAX_CREDD_LOG = 50000000







#################################################
## CREDD Expert settings
## Everything below is for the UBER-KNOWLEDGEABLE only!
## Do not change these unless you know what you are doing!
#################################################


#DC_DAEMON_LIST = \
#MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \
#VIEW_SERVER, CONDOR_VIEW, VIEW_COLLECTOR, HAWKEYE, CREDD, HAD, \
#QUILL

CREDD    = $(SBIN)/condor_credd.exe

# Timeout session quickly since we normally only get contacted
# once per starter
SEC_CREDD_SESSION_TIMEOUT = 10


# Set security settings so that full security to the credd is required
CREDD.SEC_DEFAULT_AUTHENTICATION = REQUIRED
# CREDD.SEC_DEFAULT_ENCRYPTION = REQUIRED
CREDD.SEC_DEFAULT_INTEGRITY = REQUIRED
CREDD.SEC_DEFAULT_NEGOTIATION = REQUIRED 

# Require PASSWORD auth for password fetching
CREDD.SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD

# Only honor password fetch requests to the trusted "condor_pool" user
CREDD.ALLOW_DAEMON = condor_pool@$(UID_DOMAIN)

# Require NTSSPI for storing credentials
CREDD.SEC_DEFAULT_AUTHENTICATION_METHODS = NTSSPI
######################################################################
##
##  condor_config  [Attachment 3 of 3: a submit node
##  (DAEMON_LIST = MASTER SCHEDD STARTD)]
##
##  (Stock template header and the path/security defaults are identical
##  to the first attached file above.)
##
######################################################################

##--------------------------------------------------------------------
## Values set by the condor_configure script:
##--------------------------------------------------------------------

CONDOR_HOST = AHERSRVBLD28.lgs-net.com
CONDOR_ADMIN = 
SMTP_SERVER = 
ALLOW_READ = *
ALLOW_WRITE = *
ALLOW_ADMINISTRATOR = *
JAVA = 
use POLICY : ALWAYS_RUN_JOBS
WANT_VACATE = FALSE
WANT_SUSPEND = TRUE
DAEMON_LIST = MASTER SCHEDD STARTD
SLOT_TYPE_1 = cpus=100%
NUM_SLOTS_TYPE_1 = 1



Network_interface = 194.11.95.204

FILESYSTEM_DOMAIN = lgs-net.com
UID_DOMAIN = lgs-net.com
TRUST_UID_DOMAIN = true 
SOFT_UID_DOMAIN = true
STARTER_ALLOW_RUNAS_OWNER = true
CREDD_HOST = AHERSRVBLD28.lgs-net.com 
CREDD_CACHE_LOCALLY = True   
 
ALLOW_CONFIG = * 
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD  
SEC_CONFIG_NEGOTIATION = REQUIRED  
SEC_CONFIG_AUTHENTICATION = REQUIRED  
SEC_CONFIG_ENCRYPTION = REQUIRED  
SEC_CONFIG_INTEGRITY = REQUIRED