
Re: [HTCondor-users] credmon not starting in 23.5.0 on RHEL8



Hi Christoph,

thanks a lot for your answer.

Previously, I only added CREDD and CREDMON_OAUTH to the daemon list, and that worked fine.

The actual cause is that my config is overridden later by another config file, which limits the daemon list to MASTER, SCHEDD and ADSTASH.
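
To illustrate the override, here is a simplified Python sketch. It is not HTCondor's actual config parser: it expands every $(NAME) eagerly at assignment time, and the three file names are hypothetical. It only shows why an append-style line loses its effect once a later file assigns DAEMON_LIST outright.

```python
import re

def expand(value, macros):
    """Replace each $(NAME) with the value NAME has so far (empty if unset)."""
    return re.sub(r"\$\((\w+)\)", lambda m: macros.get(m.group(1), ""), value)

def evaluate(config_files):
    """Apply assignments file by file; a later assignment replaces the earlier one."""
    macros = {}
    for _filename, assignments in config_files:
        for name, value in assignments:
            macros[name] = expand(value, macros)
    return macros

def daemon_list(macros):
    """Split the final DAEMON_LIST value into daemon names."""
    return [d.strip() for d in macros.get("DAEMON_LIST", "").split(",") if d.strip()]

# Hypothetical file names, read in this order.
files = [
    ("00-base.conf", [("DAEMON_LIST", "MASTER, SCHEDD")]),
    ("50-credmon.conf", [("DAEMON_LIST", "$(DAEMON_LIST), CREDD, CREDMON_OAUTH")]),
    ("99-site.conf", [("DAEMON_LIST", "MASTER, SCHEDD, ADSTASH")]),
]

final = daemon_list(evaluate(files))
print(final)  # ['MASTER', 'SCHEDD', 'ADSTASH']
```

With only the first two files, the list would still contain CREDD and CREDMON_OAUTH; the last file discards them because it does not reference $(DAEMON_LIST).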

Cheers,
ben

On 15/02/2024 17:35, Beyer, Christoph wrote:
Hi,

Just my 2 cents here: shouldn't SEC_CREDENTIAL_MONITOR be in the daemon list in order to start the cred_monitor?

Best
christoph




--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: "Jason Patton via HTCondor-users" <htcondor-users@xxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
CC: "Jason Patton" <jpatton@xxxxxxxxxxx>
Sent: Thursday, 15 February 2024 17:17:43
Subject: Re: [HTCondor-users] credmon not starting in 23.5.0 on RHEL8

Hi Ben,
On Thu, Feb 15, 2024 at 10:02 AM Benoit Roland <benoit.roland@xxxxxxx> wrote:
. The CREDD and CREDMON_OAUTH are not listed in the DAEMON_LIST.
The output of "condor_config_val DAEMON_LIST" only shows:
MASTER, SCHEDD, ADSTASH


This seems likely to be the source of the problem (though it's strange the credd is still running). I agree your config line looks good where you add the CREDD and CREDMON_OAUTH to the list, but I wonder if there's another config file "later" that is overriding the DAEMON_LIST.

"condor_config_val -v DAEMON_LIST" should point you to the file and line where it's last being set.

Jason

In my configuration below [6], the daemons are listed as follows:
DAEMON_LIST = $(DAEMON_LIST), CREDD, CREDMON_OAUTH

. The CREDD shows up in the output of "condor_who -quick", but not the CREDMON_OAUTH:

ADSTASH = "WaitForStartup"
ADSTASH_PID = 0
CREDD = "Alive"
CREDD_Addr = "<192.108.45.8:9620>"
CREDD_PID = 2395446
IsReady = false

MASTER = "Alive"
MASTER_Addr = "<192.108.45.8:9618?addrs=192.108.45.8-9618+[2a00-139c-3-2e5-0-21-d2-6c]-9618&alias=c4p-login-dev.gridka.de&noUDP&sock=master_2395386_2283>"
MASTER_PID = 2395386
NumAlive = 4
NumDaemons = 5
NumDead = 0
NumHold = 0
NumHung = 0
NumStartup = 1
SCHEDD = "Alive"
SCHEDD_Addr = "<192.108.45.8:9618?addrs=192.108.45.8-9618+[2a00-139c-3-2e5-0-21-d2-6c]-9618&alias=c4p-login-dev.gridka.de&noUDP&sock=schedd_2395386_2283>"
SCHEDD_PID = 2395443
SHARED_PORT = "Alive"
SHARED_PORT_Addr = "<192.108.45.8:9618?noUDP&sock=self>"
SHARED_PORT_PID = 2395440
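
A quick way to compare such output against the daemons one expects is sketched below in Python. It assumes the output has already been captured as text; the function names and the `expected` list are illustrative and not part of any HTCondor tool.

```python
import re

def daemon_states(who_output):
    """Collect NAME = "State" pairs; address and PID lines do not match."""
    states = {}
    for line in who_output.splitlines():
        m = re.match(r'^\s*(\w+)\s*=\s*"(\w+)"\s*$', line)
        if m:
            states[m.group(1)] = m.group(2)
    return states

def missing_daemons(who_output, expected):
    """Return the expected daemons that have no state line at all."""
    states = daemon_states(who_output)
    return [name for name in expected if name not in states]

sample = '''ADSTASH = "WaitForStartup"
ADSTASH_PID = 0
CREDD = "Alive"
CREDD_PID = 2395446
MASTER = "Alive"
SCHEDD = "Alive"
SHARED_PORT = "Alive"
'''

print(missing_daemons(sample, ["CREDD", "CREDMON_OAUTH"]))  # ['CREDMON_OAUTH']
```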

. In the MasterLog, there is only a repetition of this block related to the condor adstash wrapper:

02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Setting maximum accepts per cycle 8.
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Setting maximum UDP messages per cycle 100.
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Will use TCP to update collector c4p-htcondor.gridka.de <192.108.45.28:9618?alias=c4p-htcondor.gridka.de>
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Adding SHARED_PORT to DAEMON_LIST, because USE_SHARED_PORT=true (to disable this, set AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST=False)
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Adding CREDD to DAEMON_LIST. This machine is running a SCHEDD and AUTO_INCLUDE_CREDD_IN_DAEMON_LIST is TRUE)
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) enter Daemons::CheckForNewExecutable
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Time stamp of running /usr/sbin/condor_master: 1708004804
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) GetTimeStamp returned: 1708004804
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Reconfiguring all managed daemons.
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Send_Signal(): Doing kill(2395446,1) [SIGHUP]
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Sent SIGHUP to CREDD (pid 2395446)
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Send_Signal(): Doing kill(2395443,1) [SIGHUP]
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Sent SIGHUP to SCHEDD (pid 2395443)
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Send_Signal(): Doing kill(2395440,1) [SIGHUP]
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS) Sent SIGHUP to SHARED_PORT (pid 2395440)
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) enter Daemons::UpdateCollector
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Trying to update collector <192.108.45.28:9618?alias=c4p-htcondor.gridka.de>
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) Attempting to send update via TCP to collector c4p-htcondor.gridka.de <192.108.45.28:9618?alias=c4p-htcondor.gridka.de>
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) File descriptor limits: max 32768, safe 26215
02/15/24 16:48:54 (pid:2395386) (D_ALWAYS:2) exit Daemons::UpdateCollector
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) ::RealStart; ADSTASH
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) start recover timer (415)
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS) Started process "/opt/condor/py3venv/condor_adstash_wrapper.sh", pid and pgroup = 2398272
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) enter Daemons::UpdateCollector
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) Trying to update collector <192.108.45.28:9618?alias=c4p-htcondor.gridka.de>
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) Attempting to send update via TCP to collector c4p-htcondor.gridka.de <192.108.45.28:9618?alias=c4p-htcondor.gridka.de>
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS:2) exit Daemons::UpdateCollector
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS) PERMISSION DENIED to root@xxxxxxxxx from host 192.108.45.8 for command 60043 (DC_SET_READY), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 192.108.45.8,c4p-login-dev.gridka.de, hostname size = 1, original ip address = 192.108.45.8
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS) DC_AUTHENTICATE: Command not authorized, done!
02/15/24 16:49:16 (pid:2395386) (D_ERROR) The ADSTASH (pid 2398272) exited with status 1
02/15/24 16:49:16 (pid:2395386) (D_ALWAYS) restarting /opt/condor/py3venv/condor_adstash_wrapper.sh in 60 seconds
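
To pick out restart loops like this one from a long MasterLog, a small Python filter can help. This is a sketch that assumes the line layout shown above (timestamp, pid, category in parentheses); other logging settings may produce a different prefix, and the function name is illustrative.

```python
import re

# Matches lines such as:
# 02/15/24 16:49:16 (pid:2395386) (D_ERROR) The ADSTASH (pid 2398272) exited with status 1
EXIT_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:\d+\) \(D_ERROR\) "
    r"The (?P<daemon>\w+) \(pid (?P<pid>\d+)\) exited with status (?P<status>-?\d+)"
)

def daemon_exits(lines):
    """Return (timestamp, daemon, pid, status) for every daemon-exit line."""
    exits = []
    for line in lines:
        m = EXIT_RE.match(line)
        if m:
            exits.append((m["stamp"], m["daemon"], int(m["pid"]), int(m["status"])))
    return exits

sample = [
    "02/15/24 16:49:16 (pid:2395386) (D_ALWAYS) DC_AUTHENTICATE: Command not authorized, done!",
    "02/15/24 16:49:16 (pid:2395386) (D_ERROR) The ADSTASH (pid 2398272) exited with status 1",
]

print(daemon_exits(sample))  # [('02/15/24 16:49:16', 'ADSTASH', 2398272, 1)]
```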

. In the CredLog, I have some information concerning the CREDMON:

02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDD: calling and resetting sweep_timer_handler()
02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDMON: scandir(/var/lib/condor/mytoken_credentials)
02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDMON: CRED_DIR: /var/lib/condor/mytoken_credentials, MARK: manuel_giffels.mark
02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDMON: File manuel_giffels.mark has mtime 1708012356 which is less than 3600 seconds old. Skipping...
02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDMON: CRED_DIR: /var/lib/condor/mytoken_credentials, MARK: condor.mark
02/15/24 16:56:45 (pid:2395446) (D_ALWAYS:2) CREDMON: File condor.mark has mtime 1708012356 which is less than 3600 seconds old. Skipping...
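
Judging only from these log lines, the sweep seems to skip any .mark file whose mtime is less than 3600 seconds old. A Python sketch of that age check follows; it is not the credd's actual code, and since the log does not show what happens to older mark files, the function merely reports them.

```python
import os
import time

def stale_marks(cred_dir, max_age=3600, now=None):
    """Return the names of .mark files in cred_dir that are at least max_age seconds old."""
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(os.listdir(cred_dir)):
        if not entry.endswith(".mark"):
            continue
        mtime = os.stat(os.path.join(cred_dir, entry)).st_mtime
        if now - mtime < max_age:
            continue  # corresponds to "Skipping..." in the log above
        stale.append(entry)
    return stale
```

Calling stale_marks("/var/lib/condor/mytoken_credentials") at the time of the log above would return an empty list, since both mark files were reported as younger than 3600 seconds.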

So my feedback is somewhat limited, sorry for that.

Thanks a lot again!

Cheers,
ben


On 15/02/2024 15:52, Jason Patton via HTCondor-users wrote:
Hi Ben,

A couple of diagnostics you can check...

Do you still see the CREDD and CREDMON_OAUTH listed if you run "condor_config_val DAEMON_LIST"?

Do the CREDD and CREDMON_OAUTH show up in the output of "condor_who -quick"? For example:

$ condor_who -quick
CREDD = "Alive"
CREDD_Addr = "<snipped>"
CREDD_PID = 799083
CREDMON_OAUTH = "Startup"
CREDMON_OAUTH_PID = 799082
...

Are there any hints in the MasterLog (/var/log/condor/MasterLog) that the credmon is being started and/or its status?

Jason

On Thu, Feb 15, 2024 at 3:40 AM Benoit Roland <benoit.roland@xxxxxxx> wrote:
Dear all,

I have compiled HTCondor version 23.5.0 using the x86_64_AlmaLinux8-23050000 container [1], adding to the existing code
some plugins to produce [2], monitor and refresh [3,4] Helmholtz AAI access tokens.

The credential monitor [4] is based on the abstract class [5].

While I can successfully run the executables /usr/sbin/condor_producer_mytoken and /usr/sbin/condor_credmon_mytoken standalone,
only the producer is run when submitting a condor test job (sleep 1800). It seems that the credmon does not start.

My configuration is given by [6].

The credmon used to run successfully before I migrated to 23.5.0.
I no longer have the details of the version I was using at the time.

I also tried to run the OAUTH credmon, but here again, the credmon does not start when submitting a condor test job.

The main change with respect to my previous code was to make it compliant with the 23.5.0 update of [5].

Running my credmon standalone, I can see that these changes seem to have been applied successfully: the credmon runs fine and does its job.

Do you have any idea what I might be missing?

Thanks a lot in advance for your help!

Cheers,
ben

[1] https://github.com/benoitroland/C4P-HTCondor/blob/devel_rhel8/c4p-condor-utils/build-c4p-condor.sh
[2] https://github.com/benoitroland/C4P-HTCondor/blob/devel_rhel8/src/condor_credd/condor_credmon_oauth/condor_producer_mytoken
[3] https://github.com/benoitroland/C4P-HTCondor/blob/devel_rhel8/src/condor_credd/condor_credmon_oauth/condor_credmon_mytoken
[4] https://github.com/benoitroland/C4P-HTCondor/blob/devel_rhel8/src/condor_credd/condor_credmon_oauth/credmon/CredentialMonitors/MytokenCredmon.py
[5] https://github.com/benoitroland/C4P-HTCondor/blob/devel_rhel8/src/condor_credd/condor_credmon_oauth/credmon/CredentialMonitors/AbstractCredentialMonitor.py
[6] DAEMON_LIST = $(DAEMON_LIST), CREDD, CREDMON_OAUTH

use feature : OAUTH

SEC_PROCESS_SUBMIT_TOKENS = True
SendCredential = True

CREDD_HOST = $(FULL_HOSTNAME)

SEC_DEFAULT_ENCRYPTION = REQUIRED

OAUTH_ISSUER_URL = https://login.helmholtz.de/oauth2/
OAUTH_ISSUER_NAME = helmholtz

MYTOKEN_ISSUER_URL = https://mytoken.data.kit.edu
MYTOKEN_PROFILE = kit/c4p-htcondor

CREDMON_OAUTH = /usr/sbin/condor_credmon_mytoken
CREDMON_OAUTH_DEBUG = D_FULLDEBUG:2

SEC_CREDENTIAL_DIRECTORY_OAUTH = /var/lib/condor/mytoken_credentials
SEC_ENCRYPTION_KEY_DIRECTORY = /etc/condor/encryption.d/ENCRYPTION-KEY

# period at which the credd is checking the remaining life time of stored credentials
CRED_CHECK_INTERVAL = 60

# period at which the collector is updated - default value 5 minutes
CREDD_UPDATE_INTERVAL = 60


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
