
[HTCondor-users] Inconsistent execute dir permissions



Hi all,

We're seeing inconsistent ownership of job execute directories, and I'm wondering if you can provide guidance.

Our setup uses partitionable slots, as is typical, with dedicated slot users:

DEDICATED_EXECUTE_ACCOUNT_REGEXP = slot.+
STARTER_ALLOW_RUNAS_OWNER = False
SLOT1_1_USER = slot1
SLOT1_2_USER = slot2
SLOT1_3_USER = slot3
SLOT1_4_USER = slot4
<etc>

But on the nodes, I see inconsistent execute directory ownership: sometimes the directories are owned by a mix of slot users and condor, and other times they are all owned by condor.

I'm also seeing job errors consistent with the job, running as the slot user, failing to read files in those directories.

[root@ip-10-153-131-168 ~]# ls -alh /home/condor/execute
total 56K
drwxr-xr-x. 6 condor condor 4.0K Mar 16 22:08 .
drwxr-xr-x. 3 condor condor 4.0K Mar 10 13:09 ..
drwx------. 7 condor condor  12K Mar 16 21:49 dir_1043940
drwx------. 7 condor condor  12K Mar 16 22:06 dir_1062269
drwx------. 7 slot4  slot4   12K Mar 16 22:08 dir_1064108
drwx------. 7 slot3  slot3   12K Mar 16 22:09 dir_1064289

[root@ip-10-121-2-98 ~]# ls -alh /home/condor/execute/
total 56K
drwxr-xr-x. 6 condor condor 4.0K Mar 16 22:00 .
drwxr-xr-x. 3 condor condor 4.0K Mar 10 13:09 ..
drwx------. 7 condor condor  12K Mar 16 21:56 dir_1466019
drwx------. 7 condor condor  12K Mar 16 22:02 dir_1467286
drwx------. 7 condor condor  12K Mar 16 22:02 dir_1467287
drwx------. 7 condor condor  12K Mar 16 22:02 dir_1467288
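
For anyone wanting to check their own nodes for the same symptom, here is a minimal Python sketch that flags sandbox directories whose owner does not look like a slot account. The function names (misowned, scan_execute_dir) are made up for illustration, and the full-string match against slot.+ is an assumption about how the regexp should be applied, not a statement of how HTCondor evaluates it.

```python
import re
from pathlib import Path

# Same pattern as DEDICATED_EXECUTE_ACCOUNT_REGEXP in the config above.
# Assumption: the whole user name must match the pattern.
SLOT_USER_RE = re.compile(r"slot.+")


def misowned(owners, pattern=SLOT_USER_RE):
    """Return the sorted names of directories not owned by a slot account.

    owners: mapping of directory name -> owning user name.
    """
    return sorted(
        name for name, user in owners.items() if not pattern.fullmatch(user)
    )


def scan_execute_dir(execute_dir="/home/condor/execute"):
    """Map each dir_* sandbox under execute_dir to its owning user name.

    Path.owner() is Unix-only and raises KeyError if the uid has no
    passwd entry.
    """
    return {
        p.name: p.owner()
        for p in Path(execute_dir).glob("dir_*")
        if p.is_dir()
    }
```

Running misowned(scan_execute_dir()) as root on a node lists the offending directories. Fed the first listing above, it would flag dir_1043940 and dir_1062269; fed the second, all four directories.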

Any idea how this would be happening? Log entries to look for? Ever seen it before? Any config changes to try?

Thanks,

--john

--
John Hover
Group Leader | Grid Group/Experiment Services
RHIC/ATLAS Computing Facility | Brookhaven National Laboratory
jhover@xxxxxxx | 631-344-5828 | http://www.racf.bnl.gov/Members/jhover