
[HTCondor-users] Windows credd, pool_password, run_as_owner all working, but not with Encrypt_Execute_Directory



Hi All

 

We have recently moved from testing into production with a pool_password for authentication, a credential server (credd) running, and users submitting with run_as_owner. This all works fine (after a couple of unexpected gotchas).

 

I have now gone back to looking at encrypt_execute_directory (set on the submit side). We tested this before enabling run_as_owner, and things worked fine as long as you allow for non-Windows file servers that store input and output data, and use the cipher command before uploading output.

 

That testing was done before the run_as_owner option was in production. It was tested, but only on test VM execute nodes that I created, where my account was in the admin group.

 

Without run_as_owner the StarterLog.slot1 log file has entries like:

 

03/03/22 09:59:50 setting the orig job name in starter

03/03/22 09:59:50 setting the orig job iwd in starter

03/03/22 09:59:50 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_13488" to user condor-slot1

03/03/22 09:59:50 Loaded Registry hives for condor-slot1

03/03/22 09:59:50 Chirp config summary: IO false, Updates false, Delayed updates true.

03/03/22 09:59:50 Initialized IO Proxy.

03/03/22 09:59:50 Setting resource limits not implemented!

03/03/22 09:59:50 File transfer completed successfully.

03/03/22 09:59:51 Job 251.7 set to execute immediately

03/03/22 09:59:51 Starting a VANILLA universe job with ID: 251.7

03/03/22 09:59:51 Tracking process family by login "condor-slot1"

 

With run_as_owner the entries show:

 

03/03/22 19:02:49 setting the orig job name in starter

03/03/22 19:02:49 setting the orig job iwd in starter

03/03/22 19:02:49 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_12664" to user hit023

03/03/22 19:02:49 Chirp config summary: IO false, Updates false, Delayed updates true.

03/03/22 19:02:49 IOProxy: couldn't write to C:\PROGRA~1\condor\execute\dir_12664\.chirp.config: Permission denied

03/03/22 19:02:49 Couldn't initialize IO Proxy.

03/03/22 19:02:49 Setting resource limits not implemented!

03/03/22 19:02:49 get_file(): Failed to open file C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe, errno = 13: Permission denied.

03/03/22 19:02:49 get_file(): consumed 1803 bytes of file transmission

03/03/22 19:02:49 DoDownload: consuming rest of transfer and failing after encountering the following error: STARTER at 152.83.115.17 failed to write to file C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe: (errno 13) Permission denied

03/03/22 19:02:49 Failed to set execute bit on C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe, errno=2 (No such file or directory)

03/03/22 19:02:49 File transfer failed (status=0).

03/03/22 19:02:49 ERROR "Failed to transfer files" at line 2468 in file D:\execute\dir_10492\sources\src\condor_starter.V6.1\jic_shadow.cpp

03/03/22 19:02:49 ShutdownFast all jobs.

03/03/22 19:02:49 Failed to open '.update.ad' to read update ad: No such file or directory (2).

03/03/22 19:02:49 condor_read(): Socket closed abnormally when trying to read 21 bytes from <152.83.3.21:61271>, errno=10054
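
To make the difference between the two excerpts easier to spot across many log files, here is a minimal Python sketch that pulls out the user the execute directory was encrypted to and any "Permission denied" messages. The function name summarize_starter_log and the regular expressions are my own and assume only the line format shown in the excerpts above; this is not an HTCondor tool.

```python
import re

# Assumed line format, taken from the excerpts: "MM/DD/YY HH:MM:SS message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")
ENCRYPT_RE = re.compile(r'Encrypting execute directory "([^"]+)" to user (\S+)')


def summarize_starter_log(text):
    """Return the encryption target (directory, user) and permission-denied messages."""
    summary = {"directory": None, "user": None, "denied": []}
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank lines and anything without a timestamp
        message = match.group(2)
        enc = ENCRYPT_RE.search(message)
        if enc:
            summary["directory"], summary["user"] = enc.group(1), enc.group(2)
        elif "Permission denied" in message:
            summary["denied"].append(message)
    return summary


# A few lines copied from the run_as_owner excerpt above.
SAMPLE = r'''
03/03/22 19:02:49 Encrypting execute directory "C:\PROGRA~1\condor\execute\dir_12664" to user hit023
03/03/22 19:02:49 IOProxy: couldn't write to C:\PROGRA~1\condor\execute\dir_12664\.chirp.config: Permission denied
03/03/22 19:02:49 Couldn't initialize IO Proxy.
03/03/22 19:02:49 get_file(): Failed to open file C:\PROGRA~1\condor\execute\dir_12664\condor_exec.exe, errno = 13: Permission denied.
'''

if __name__ == "__main__":
    result = summarize_starter_log(SAMPLE)
    print("encrypted to:", result["user"])
    for message in result["denied"]:
        print("denied:", message)
```

In the excerpts, the first write into the job directory fails immediately after it is encrypted to the submitting user (hit023) rather than to condor-slot1.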

 

In production, though, the execute nodes (all in a single AD domain) have “ourdomain\Domain Users” in their “users” group, which includes all our HTCondor users. The allow permissions on the condor “execute” folder on the execute nodes are:

Read & execute

List folder contents

Read

 

There is no allow for:

Full control

Modify

Write

 

For testing, I manually added Full control, Modify, and Write permissions on a single execute node, but the errors are the same.

SYSTEM on the execute node has full control of the execute folder as well.
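
For anyone wanting to compare the same settings on their own nodes, below is a rough Python sketch that dumps the ACL and the EFS state of the execute folder using the standard Windows icacls and cipher tools. The path is the one from the log excerpts, and the helper names (build_inspection_commands, run_inspection) are hypothetical. It reports what the invoking account can see, which is not necessarily what the starter or the job owner sees.

```python
import subprocess
import sys

# Path taken from the log excerpts above; adjust for the local install.
EXECUTE_DIR = r"C:\PROGRA~1\condor\execute"


def build_inspection_commands(path):
    """Return the Windows commands that report ACLs and EFS state for path."""
    return [
        ["icacls", path],  # list the access control entries on the folder
        ["cipher", path],  # show the EFS encryption state of the folder
    ]


def run_inspection(path):
    """Run each command and return a dict of command string to combined output."""
    results = {}
    for cmd in build_inspection_commands(path):
        done = subprocess.run(cmd, capture_output=True, text=True)
        results[" ".join(cmd)] = done.stdout + done.stderr
    return results


if __name__ == "__main__":
    if sys.platform == "win32":
        for name, output in run_inspection(EXECUTE_DIR).items():
            print(f"== {name}\n{output}")
    else:
        print("icacls and cipher are Windows tools; run this on an execute node")
```

Running it once on a test VM node where things work and once on a production node would give two outputs that can be compared directly.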

 

Thanks for any info/insights/suggestions/comments.

 

Cheers

 

Greg