
Re: [HTCondor-users] Assistance Configuring Authentication



Yes, for some reason my setup is not following the default behavior.

 

 

condor_status -schedd does show all schedds.

 

 

I ran condor_status -schedd -autoformat Name MyAddress and tried using the name and address of a schedd with condor_submit -name, but the job was still submitted to another schedd.

 

 

Below is an example of condor_q:

 

 

seamew:condor coulter$ condor_q

 

 

-- Schedd: starlifter.xxx.xxx.xxx : <xxx.xxx.xxx.xxx:64271?... @ 02/05/20 14:50:30

OWNER   BATCH_NAME    SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS

coulter ID: 1310     2/5  10:48      _      _      _      1      1 1310.0

 

Total for query: 1 jobs; 0 completed, 0 removed, 0 idle, 0 running, 1 held, 0 suspended

Total for coulter: 1 jobs; 0 completed, 0 removed, 0 idle, 0 running, 1 held, 0 suspended

Total for all users: 15 jobs; 0 completed, 0 removed, 0 idle, 0 running, 15 held, 0 suspended

 

seamew:condor coulter$

seamew:condor coulter$ condor_rm coulter

All jobs of user "coulter" have been marked for removal

seamew:condor coulter$ condor_q

 

 

-- Schedd: canberra.xxx.xxx.xxx : <xxx.xxx.xxx.xxx:49241?... @ 02/05/20 14:48:16

OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS

 

Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

Total for coulter: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

Total for all users: 14 jobs; 0 completed, 0 removed, 0 idle, 0 running, 14 held, 0 suspended

 

As you can see, I am running condor_q from seamew, but the first result shows the schedd located on starlifter, and after running condor_rm the schedd changes to canberra.

 

I'm not sure how to set the _CONDOR_TOOL_DEBUG=D_COMMAND environment variable for condor_submit -debug.  I tried placing it in the submit file:

 

# Unix submit description file

# sleep.sub -- simple sleep job

 

environment        = _CONDOR_TOOL_DEBUG=D_COMMAND _condor_submit -debug -name /net/home/coulter/sleep.sub

executable              = sleep.sh

log                     = /net/home/coulter/condor/sleep.log

output                  = /net/home/coulter/condor/sleep.out

error                   = /net/home/coulter/condor/sleep.error

should_transfer_files   = Yes

when_to_transfer_output = ON_EXIT

 

queue

 

but I don't see any additional output in sleep.log so I don't think I'm doing that right.

 

Here's the local config file all workstations with schedds share:

 

CONDOR_HOST=macdaddy.xxx.xxx.xxx

DAEMON_LIST=MASTER STARTD SCHEDD

ALLOW_READ=*

ALLOW_WRITE=*

ALLOW_NEGOTIATOR = macdaddy.xxx.xxx.xxx

CONDOR_IDS = 3055.8186

CONDOR_ADMIN = root@$(FULL_HOSTNAME)

ALL_DEBUG = D_FULLDEBUG

 

SCHEDD_HOST=$(FULL_HOSTNAME)

 

USE_SHARED_PORT=FALSE

 

SEC_DEFAULT_AUTHENTICATION = OPTIONAL

SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE

 

Thanks again for any suggestions you may have.

 

Jim

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of John M Knoeller
Sent: Monday, February 3, 2020 11:32 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [Non-DoD Source] Re: [HTCondor-users] Assistance Configuring Authentication

 

That’s a bit surprising; the *default* behavior of condor_submit is to submit to the local schedd.  Configuring SCHEDD_HOST is normally something you do when you want it to default to a non-local schedd instead.

 

What does

 

condor_status -schedd

 

show? 

It should be showing all of your schedds.

 

Now try

 

condor_status -schedd -autoformat Name MyAddress

 

That should show the names and addresses of all of your schedds.

 

When you run “condor_submit -name” using the Name field from the condor_status output above, that should submit to the schedd of that name.

It does that by doing the equivalent of that condor_status query. 

 

You can add -debug to condor_submit and set the _CONDOR_TOOL_DEBUG environment variable to see a log of the communication, like this:

_CONDOR_TOOL_DEBUG=D_COMMAND condor_submit -debug -name <schedd_name> <submitfile>

where <schedd_name> is the Name of the schedd from the condor_status output and <submitfile> is the name of your submit file.  This will log which schedd it is actually talking to.  Look for QMGMT_WRITE_CMD; that is the actual submit.  There will likely be a command before that where it queries the collector to get the address of the schedd.
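
A minimal sketch of how a VARIABLE=value prefix behaves in a POSIX shell, since that is the part that tends to cause confusion. It assumes a child sh process as a stand-in for condor_submit so it can run on a machine without HTCondor; the child_sees helper is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Stand-in for condor_submit (assumption, for illustration): a child process
# that prints the value of _CONDOR_TOOL_DEBUG it inherited.
child_sees() {
    sh -c 'echo "${_CONDOR_TOOL_DEBUG:-unset}"'
}

# Start from a clean state so the results below are predictable.
unset _CONDOR_TOOL_DEBUG

# One-shot form: the assignment is a prefix on the command line itself
# and applies only to that one command.
one_shot=$(_CONDOR_TOOL_DEBUG=D_COMMAND sh -c 'echo "${_CONDOR_TOOL_DEBUG:-unset}"')

# The calling shell is not changed by the one-shot form.
after_one_shot=$(child_sees)

# Session form: export once, then every later command inherits the value.
export _CONDOR_TOOL_DEBUG=D_COMMAND
exported=$(child_sees)
unset _CONDOR_TOOL_DEBUG

echo "one-shot: $one_shot"
echo "afterwards: $after_one_shot"
echo "exported: $exported"
```

With HTCondor installed, the same pattern would be typed at the prompt, for example _CONDOR_TOOL_DEBUG=D_COMMAND condor_submit -debug sleep.sub. The debug output goes to the terminal (stderr), not to the job's log file. An environment line inside a submit file sets the environment of the job, so it has no effect on condor_submit itself.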

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of COULTER, JAMES A CTR USAF AFMC 96 SK/CCI via HTCondor-users
Sent: Friday, January 31, 2020 1:28 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: COULTER, JAMES A CTR USAF AFMC 96 SK/CCI <james.coulter.2.ctr@xxxxxxxxx>
Subject: Re: [HTCondor-users] Assistance Configuring Authentication

 

Thanks, that’s very helpful.  I am still having problems getting condor_submit to send jobs to the schedd running on the workstation it was submitted from.  I have tried setting SCHEDD_HOST in the local config file as well as trying condor_submit -name <schedd_name>.  No errors are reported, but jobs are still submitted to one schedd only.

 

Any suggestions on what else I can try to force condor_submit to send the job to its own schedd and not another workstation's?

 

Thank you!

 

Jim

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of John M Knoeller
Sent: Thursday, January 30, 2020 2:59 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [Non-DoD Source] Re: [HTCondor-users] Assistance Configuring Authentication

 

condor_q -all

 

will show you the name of the local/default schedd (it’s at the top of the output)

 

condor_status -schedd

 

will show you the names of all of the schedds in your pool
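
A small sketch of pulling the schedd name out of the condor_q header line, which makes it easy to see which schedd a tool actually contacted. The sample line is copied from the condor_q output quoted in this thread; the schedd_from_header helper is a hypothetical name, and the field position assumes the header keeps the "-- Schedd: name : address" layout shown there.

```shell
#!/bin/sh
# Print the schedd name from a condor_q header line read on stdin.
# Assumes the header looks like: -- Schedd: <name> : <address> @ <time>
schedd_from_header() {
    awk '/^-- Schedd:/ { print $3; exit }'
}

# Header line copied from the condor_q output quoted in this thread.
sample='-- Schedd: starlifter.xxx.xxx.xxx : <xxx.xxx.xxx.xxx:64271?... @ 02/05/20 14:50:30'
queried=$(printf '%s\n' "$sample" | schedd_from_header)
echo "condor_q talked to: $queried"

# With HTCondor installed, the live equivalent would be:
#   condor_q -all | schedd_from_header
```

Comparing that name with the workstation's own host name shows at a glance whether the local schedd or a remote one answered.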

 

-tj

 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of COULTER, JAMES A CTR USAF AFMC 96 SK/CCI via HTCondor-users
Sent: Thursday, January 30, 2020 8:57 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: COULTER, JAMES A CTR USAF AFMC 96 SK/CCI <james.coulter.2.ctr@xxxxxxxxx>
Subject: Re: [HTCondor-users] Assistance Configuring Authentication

 

Thanks for the reply.  SCHEDD_HOST is undefined.  The documentation says it is set with name@hostname, but for the life of me I can't figure out what the schedd daemons' names are.  I ran condor_submit -verbose, but I can't find a name attribute in the output.  Can you tell me how to determine the name of the schedd?

 

Thanks,

 

Jim

 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of John M Knoeller <johnkn@xxxxxxxxxxx>
Sent: Wednesday, January 29, 2020 3:15 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [Non-DoD Source] Re: [HTCondor-users] Assistance Configuring Authentication

 

condor_submit decides which schedd to send jobs to by looking at:

   the -remote option passed on the command line

   the configured value of SCHEDD_HOST

   a local schedd (the configuration variable SCHEDD_ADDRESS_FILE tells it where to look)

 

If you are talking to a non-local schedd unexpectedly, you should check your configuration for SCHEDD_HOST by running

 

   condor_config_val -verbose SCHEDD_HOST
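
The three inputs listed above can be sketched as a small decision function. This is only a model for reasoning about a misdirected submit, not condor_submit's actual code: it assumes the inputs are consulted in the order listed, which the message does not state explicitly, and pick_schedd_source is a hypothetical helper name.

```shell
#!/bin/sh
# Model of the three inputs, ASSUMING they are checked in the order listed.
# $1 = schedd named on the command line (empty if none)
# $2 = configured SCHEDD_HOST (empty if undefined)
pick_schedd_source() {
    if [ -n "$1" ]; then
        echo "command line: $1"
    elif [ -n "$2" ]; then
        echo "SCHEDD_HOST: $2"
    else
        echo "local schedd via SCHEDD_ADDRESS_FILE"
    fi
}

# Nothing set: fall back to the local schedd.
pick_schedd_source "" ""
# SCHEDD_HOST configured: it redirects the default.
pick_schedd_source "" "starlifter.xxx.xxx.xxx"
# Command-line option given: it takes precedence in this model.
pick_schedd_source "canberra.xxx.xxx.xxx" "starlifter.xxx.xxx.xxx"
```

If jobs land on an unexpected schedd, walking through the branches in this order with the real values from condor_config_val is one way to narrow down which input is responsible.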

 

I should warn you that CLAIMTOBE is *very* dangerous in this context.   CLAIMTOBE “authentication” allows any user to impersonate any other user, and since condor_submit can ask the schedd to run arbitrary code, CLAIMTOBE would let any user run code as any other user.

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of COULTER, JAMES A CTR USAF AFMC 96 SK/CCI via HTCondor-users
Sent: Wednesday, January 29, 2020 2:26 PM
To: htcondor-users@xxxxxxxxxxx
Cc: COULTER, JAMES A CTR USAF AFMC 96 SK/CCI <james.coulter.2.ctr@xxxxxxxxx>
Subject: Re: [HTCondor-users] Assistance Configuring Authentication

 

Hi Collin,

 

Thanks for the reply.  FS_REMOTE_DIR does not seem to be recognized.  Using CLAIMTOBE does allow one workstation to submit its job to the other workstation's schedd. I can now submit jobs from both workstations.

 

I am curious how condor_submit determines which schedd to send a job to.  Or is that maybe the negotiator on the master?  I think the preferable way (at least for us) would be for condor_submit to send the job to the schedd on the workstation the job was sent from.

 

Thanks again!

 

Jim

 

>Hi Jim,
>

>If the FS issue is that /tmp isn't available to the Schedd you can change
>where it writes the test file with FS_REMOTE_DIR.
>
>If you want to test the rest of the configuration you can try temporarily
>using the CLAIMTOBE method, which trusts the client and is insecure.
>Something like:
>
>SEC_DEFAULT_AUTHENTICATION = OPTIONAL
>SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
>
>I don't have experience with PASSWORD authentication, so hopefully someone
>more knowledgeable can help you troubleshoot that.
>
>Best,
>Collin


On Wed, Jan 29, 2020 at 9:22 AM COULTER, JAMES A CTR USAF AFMC 96 SK/CCI
via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

> Hi,
>
>
> I have a requirement to configure HTCondor to submit jobs to Mac Sierra
> workstations.  So far I have installed a condor master running Master,
> Collector, and Negotiator daemons on a RHEL 7 server.  I have installed
> htcondor on two Mac Sierra workstations both running startd and schedd
> daemons.  We created a condor user and home on our NFS file system that all
> machines can access.
>
>
> The problem I'm having is the workstations are both submitting jobs to the
> same schedd.  Sometimes it's to workstation A, sometimes to workstation B.
> If workstation A submits its job to B's schedd (and vice versa) I get an
> authentication error.
>
>
> I have tried several different authentication methods, but I can't get any
> to work.  If I leave SEC_DEFAULT_AUTHENTICATION = OPTIONAL, I get a
> Kerberos authentication failed error.
>
>
> Right now I am trying the pool password configuration I found in the FAQ:
> https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToEnablePoolPassword
> This setup results in FS authentication failing when workstation A submits
> its job to schedd on workstation B.
>
>
> Workstations A and B are using the same condor_config.local file.  Here
> are the contents:
>
>
> CONDOR_HOST=master.example.com
> DAEMON_LIST=MASTER STARTD SCHEDD
> ALLOW_READ=*
> ALLOW_WRITE=*
> ALLOW_NEGOTIATOR = master.example.com
> CONDOR_IDS = 3055.8186
> CONDOR_ADMIN = root@$(FULL_HOSTNAME)
> ALL_DEBUG = D_FULLDEBUG
>
> #
> # From https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToEnablePoolPassword
> #
> SEC_PASSWORD_FILE = /etc/condor/condor_pool_password
> # (NOTE: this file was created on the master and copied to both clients, owner root, mode 0600)
> SEC_DAEMON_INTEGRITY = REQUIRED
> SEC_DAEMON_AUTHENTICATION = REQUIRED
> SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
> SEC_NEGOTIATOR_INTEGRITY = REQUIRED
> SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
> SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
> SEC_CLIENT_AUTHENTICATION_METHODS = FS,PASSWORD
> ALLOW_DAEMON = condor_pool@*
>
> -----------------------------------------------------------------------------------
>
>
>
> Here are the errors found in SchedLog after workstation A tries to submit a
> job to schedd on workstation B:
>
>
> 01/29/20 10:28:56 (pid:14019) DC_AUTHENTICATE: authentication of
> <xxx.xxx.xxx.117:58966> did not result in a valid mapped user name, which
> is required for this command (1112 QMGMT_WRITE_CMD), so aborting.
> 01/29/20 10:28:56 (pid:14019) DC_AUTHENTICATE: reason for authentication
> failure: AUTHENTICATE:1003:Failed to authenticate with any
> method|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to
> lstat(/tmp/FS_Ox5tu50VK)
>
>
> ---------------------------------------------------------------------------------
>
> This is what failure on a client looks like:
>
> coulter@albatross ~/condor>/opt/condor/bin/condor_submit -debug sleep.sub
> 01/29/20 10:28:59 Reading condor configuration from
> '/etc/condor/condor_config'
> 01/29/20 10:28:59 Enumerating interfaces: lo0 127.0.0.1 up
> 01/29/20 10:28:59 Enumerating interfaces: lo0 ::1 up
> 01/29/20 10:28:59 Enumerating interfaces: lo0 fe80::1 up
> 01/29/20 10:28:59 Enumerating interfaces: en0 xxx.xxx.xxx.117 up
> Submitting job(s)01/29/20 10:28:59 SharedPortClient: sent connection
> request to schedd at <xxx.xxx.xxx.235:9618> for shared port id 1964_6748_6
> 01/29/20 10:28:59 SECMAN: required authentication with schedd at
> <xxx.xxx.xxx.235:9618> failed, so aborting command QMGMT_WRITE_CMD.
>
> ERROR: Failed to connect to local queue manager
> AUTHENTICATE:1003:Failed to authenticate with any method
> AUTHENTICATE:1004:Failed to authenticate using FS
>
>
> ---------------------------------------------------------------------------------
>
> The way I read this, FS authentication is attempting to read a file on
> the schedd's local file system, but because it isn't the submitter's local
> file system, it fails.  I don't see anything at all about Password
> authentication.  I tried setting SEC_CLIENT_AUTHENTICATION_METHODS =
> PASSWORD but that results in AUTHENTICATE:1003:Failed to authenticate
> with any method.
>
>
> Any suggestions on what I can do? My customer has a grand total of 20 Mac
> Sierra workstations they want in the pool and we are on a dedicated network
> so security is not as high on the priority list as getting this working.
>
>
> Thanks,
>
>
> Jim
>
>
>
>



--
*Collin Mehring *| PE-JoSE - Software Engineer