
[HTCondor-users] HTCondor-C SetEffectiveOwner - Permission denied [SEC=UNCLASSIFIED]



Hi,

 

I’m struggling with HTCondor-C.

 

This was originally working on our system, but during the two years I have been away something failed and users reverted to using it as a single pool.

We are still running 7.8.8 across a number of dedicated Linux processors, with Windows machines for user submission.  I don’t want to upgrade until I find the answer to this issue.

 

If a grid job is submitted, it sits Idle locally with: Request has not been considered by the Matchmaker

The Gridmanager process keeps starting up, repeatedly failing to set permissions for something (SetEffectiveOwner, going by the log below), and then exiting.  The SchedLog shows something similar.

 

I have Googled my heart out to no avail, and have re-installed HTCondor on the Windows submit machine.  What is it about the UIDs/permissions?

Jobs submitted as vanilla rather than grid universe to the same remote central manager run as normal.  The GAHP worker never fires, so I think the problem is local.

 

Can anyone please be of assistance?

 

Troy

 

 

GridmanagerLog:

...

02/11/14 14:23:40 [7608] TokenCache contents:

troy@domain

02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent()

02/11/14 14:23:40 [7608] DaemonCore::IsPidAlive(): OpenProcess failed

02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent() - ppid 4740l disappeared!

02/11/14 14:23:40 [7608] Checking proxies

02/11/14 14:23:43 [7608] Initialized the following authorization table:

02/11/14 14:23:43 [7608] Authorizations yet to be resolved:

02/11/14 14:23:43 [7608] allow READ:  */* */*

02/11/14 14:23:43 [7608] allow WRITE:  */* */local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow NEGOTIATOR:  */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow ADMINISTRATOR:  */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow OWNER:  */ local@xxxxx */NEW-50985.aad.gov.au */147.66.85.62 */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow DAEMON:  */* */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow ADVERTISE_STARTD:  */* */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow ADVERTISE_SCHEDD:  */* */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] allow ADVERTISE_MASTER:  */* */ local@xxxxx */147.66.85.62 */147.66.85.62

02/11/14 14:23:43 [7608] Received ADD_JOBS signal

02/11/14 14:23:43 [7608] in doContactSchedd()

02/11/14 14:23:43 [7608] TokenCache contents:

troy@domain

02/11/14 14:23:43 [7608] SetEffectiveOwner(troy@domain) failed with errno=13: Permission denied.

02/11/14 14:23:43 [7608] Failed to connect to schedd! Will retry

02/11/14 14:23:45 [7608] Evaluating staleness of remote job statuses.

02/11/14 14:23:48 [7608] in doContactSchedd()

02/11/14 14:23:48 [7608] TokenCache contents:

troy@domain

...[SNIP]...

02/11/14 14:24:23 [7608] SetEffectiveOwner(troy@domain) failed with errno=13: Permission denied.

02/11/14 14:24:23 [7608] Failed to connect to schedd! Will retry

02/11/14 14:24:28 [7608] in doContactSchedd()

02/11/14 14:24:28 [7608] TokenCache contents:

troy@domain

02/11/14 14:24:28 [7608] SetEffectiveOwner(troy@domain) failed with errno=13: Permission denied.

02/11/14 14:24:28 [7608] Failed to connect to schedd!

02/11/14 14:24:28 [7608] ERROR "Too many failures connecting to schedd!" at line 1246 in file c:\condor\execute\dir_11160\userdir\src\condor_gridmanager\gridmanager.cpp

02/11/14 14:28:40 init_user_ids: want user 'troy@domain', current is '(null)@(null)'

02/11/14 14:28:40 Found credential for user 'troy@domain'

02/11/14 14:28:40 LogonUser completed.
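
To see how often the call fails before the gridmanager gives up, the excerpt above can be tallied with a short script. This is only a sketch: the regular expression is an assumption based on the lines quoted here, and `count_failures` is a hypothetical helper name, not part of HTCondor.

```python
import errno
import re
from collections import Counter

# Pattern assumed from the GridmanagerLog lines quoted above.
FAILURE = re.compile(
    r"\[(?P<pid>\d+)\] SetEffectiveOwner\((?P<owner>[^)]*)\) "
    r"failed with errno=(?P<errno>\d+)"
)


def count_failures(lines):
    """Count SetEffectiveOwner failures per (pid, owner, errno name)."""
    counts = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            code = int(match.group("errno"))
            # Translate the numeric errno into its symbolic name (13 is EACCES).
            name = errno.errorcode.get(code, "UNKNOWN")
            counts[(match.group("pid"), match.group("owner"), name)] += 1
    return counts


# Sample lines copied from the log excerpt above.
sample = [
    "02/11/14 14:23:43 [7608] SetEffectiveOwner(troy@domain) failed with errno=13: Permission denied.",
    "02/11/14 14:23:43 [7608] Failed to connect to schedd! Will retry",
    "02/11/14 14:24:23 [7608] SetEffectiveOwner(troy@domain) failed with errno=13: Permission denied.",
]
print(count_failures(sample))
```

Pointing the same function at the full log file (for example `count_failures(open(path))`, with `path` being wherever GRIDMANAGER_LOG resolves on the submit machine) should show whether every gridmanager start fails the same way.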

 

SchedLog:

...

02/11/14 13:36:11 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"

02/11/14 13:36:12 (pid:4740) Number of Active Workers 0

02/11/14 13:36:14 (pid:4740) Number of Active Workers 0

02/11/14 13:36:15 (pid:4740) Number of Active Workers 0

02/11/14 13:36:16 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"

02/11/14 13:36:17 (pid:4740) Number of Active Workers 0

02/11/14 13:36:18 (pid:4740) Number of Active Workers 0

02/11/14 13:36:20 (pid:4740) Number of Active Workers 0

02/11/14 13:36:21 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"

02/11/14 13:36:21 (pid:4740) condor_gridmanager (PID 7652, owner troy) exited with return code 4.

02/11/14 13:36:21 (pid:4740) Number of Active Workers 0
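
The schedd side of the same failure can be pulled out in a similar way, showing which owner was requested, which owner the schedd considered active, and how far apart the attempts were. Again a sketch under assumptions: the line format is taken from the excerpt above, the timestamps are assumed to be MM/DD/YY, and `parse_violations` and `retry_gaps` are hypothetical helper names.

```python
import re
from datetime import datetime

# Pattern assumed from the SchedLog lines quoted above.
VIOLATION = re.compile(
    r'^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:(?P<pid>\d+)\) '
    r'SetEffectiveOwner security violation: setting owner to (?P<wanted>\S+) '
    r'when active owner is "(?P<active>[^"]*)"'
)


def parse_violations(lines):
    """Return (timestamp, requested owner, active owner) for each violation."""
    events = []
    for line in lines:
        match = VIOLATION.match(line)
        if match:
            # Assumption: the log uses MM/DD/YY HH:MM:SS timestamps.
            when = datetime.strptime(match.group("stamp"), "%m/%d/%y %H:%M:%S")
            events.append((when, match.group("wanted"), match.group("active")))
    return events


def retry_gaps(events):
    """Seconds between consecutive violations."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# Sample lines copied from the log excerpt above.
sample = [
    '02/11/14 13:36:11 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"',
    '02/11/14 13:36:12 (pid:4740) Number of Active Workers 0',
    '02/11/14 13:36:16 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"',
    '02/11/14 13:36:21 (pid:4740) SetEffectiveOwner security violation: setting owner to troy@domain when active owner is "SYSTEM"',
]
events = parse_violations(sample)
print(events)
print(retry_gaps(events))
```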

 

 

condor_config.local:

...

UID_DOMAIN = $(FULL_HOSTNAME)

 

#TRUST_UID_DOMAIN = TRUE

 

HOSTALLOW_READ = *

HOSTALLOW_WRITE = *

 

##  Daemons

DAEMON_LIST=MASTER SCHEDD COLLECTOR NEGOTIATOR

 

## GRID PARAMS

CONDOR_GAHP = $(SBIN)/condor_c-gahp

GRIDMANAGER_LOG = $(LOG)/GridLogs/GridmanagerLog.$(USERNAME)

C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)

C_GAHP_WORKER_THREAD_LOG = $(LOG)/GridLogs/CGAHPWorkerLog.$(USERNAME)

 

## DEBUGGING

GRIDMANAGER_DEBUG = D_FULLDEBUG

C_GAHP_DEBUG = D_FULLDEBUG

C_GAHP_WORKER_THREAD_DEBUG = D_FULLDEBUG

 

## Security

SEC_DEFAULT_NEGOTIATION = OPTIONAL

SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE

 

 

Submit file:

Universe = grid

Executable = R

transfer_executable = False

Arguments = --version

Error = Error_$(Cluster).$(Process).txt

Output = Output_$(Cluster).$(Process).txt

Log = Condor_log.txt

should_transfer_files = True

when_to_transfer_output = ON_EXIT

grid_resource = condor server1.a.b.c server1.a.b.c

+remote_requirements = Arch == "X86_64" && OpSys == "LINUX"

+remote_universe = vanilla

+remote_shouldtransferfiles = "YES"

+remote_whentotransferoutput = "ON_EXIT"

Queue

 

___________________________________________________________________________

    Australian Antarctic Division - Commonwealth of Australia
        Visit our web site at http://www.antarctica.gov.au/
___________________________________________________________________________