
Re: [HTCondor-users] HTCondor-C SetEffectiveOwner - Permission denied [SEC=UNCLASSIFIED]



Thanks for your reply Brian,

I tried your suggestion.  Still no go, although the SchedLog now has the following after the IO: EOF line:

condor_gridmanager (PID 8376, owner troy) exited with return code 4.
02/18/14 14:11:14 (pid:5504) init_user_ids: want user 'troy@mydomain', current is 'troy@mydomain'
02/18/14 14:11:14 (pid:5504) init_user_ids: Already have handle for troy@mydomain, so returning.
02/18/14 14:11:14 (pid:5504) TokenCache contents: 
troy@domain
02/18/14 14:11:14 (pid:5504) Removed scratch dir C:\Windows\TEMP\condor_g_scratch.017CA080.5504

Nothing else useful shows up in the gridmanager log.
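
In case it helps anyone digging through a similar SchedLog, here is a minimal Python sketch for pulling out the SetEffectiveOwner violations. The regular expression is modelled on the lines quoted in this thread, so the wording may differ in other HTCondor versions, and the function name and sample lines are only illustrative. It assumes each log entry sits on a single line, as in the log file itself rather than the mail-wrapped excerpts.

```python
import re

# Pattern modelled on the schedd message quoted in this thread (7.8.8);
# other HTCondor versions may word the message differently.
VIOLATION = re.compile(
    r'^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:(?P<pid>\d+)\) '
    r'SetEffectiveOwner security violation: setting owner to (?P<wanted>\S+) '
    r'when active owner is "(?P<active>[^"]*)"'
)


def find_violations(lines):
    """Return one dict per SetEffectiveOwner violation found in the lines."""
    hits = []
    for line in lines:
        match = VIOLATION.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


# Sample entries copied from the SchedLog excerpt, unwrapped to one line each.
sample = [
    '02/17/14 10:58:17 (pid:5504) Received TCP command 1112 '
    '(QMGMT_WRITE_CMD) from SYSTEM <147.66.85.78:34708>, access level WRITE',
    '02/17/14 10:58:17 (pid:5504) SetEffectiveOwner security violation: '
    'setting owner to troy@mydomain when active owner is "SYSTEM"',
    '02/17/14 10:58:17 (pid:5504) IO: EOF reading packet header',
]

for hit in find_violations(sample):
    print(hit["stamp"], hit["wanted"], "requested while active owner is", hit["active"])
```

To run it over a real log, pass an open file object in place of the sample list.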

Thanks though

Troy
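
As a rough illustration of why `.*` is the widest possible value for the QUEUE_SUPER_USER_MAY_IMPERSONATE setting quoted below, this Python sketch checks a few candidate patterns against owner names taken from the logs. It assumes the schedd requires the whole owner string to match the pattern, and I do not know whether it compares against "troy" or "troy@mydomain". The helper name and that matching rule are assumptions for illustration, not HTCondor's actual code.

```python
import re


def may_impersonate(pattern, owner):
    """Assumed rule: the entire owner string must match the pattern."""
    return re.fullmatch(pattern, owner) is not None


# Owner strings as they appear in the log excerpts in this thread.
candidates = ["troy", "troy@mydomain", "SYSTEM"]

# ".*" is the suggested value; the other two patterns are hypothetical
# narrower alternatives, shown only for comparison.
for pattern in [r".*", r"troy", r"troy(@.*)?"]:
    allowed = [owner for owner in candidates if may_impersonate(pattern, owner)]
    print(pattern, "->", allowed)
```

Under that assumption `.*` admits every owner string, so a narrower pattern would not be expected to get past a check that `.*` fails.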

On Tuesday, 18 February 2014 1:13 PM, Brian Bockelman wrote:
> 
> Hi Troy,
> 
> Try this:
> 
> QUEUE_SUPER_USER_MAY_IMPERSONATE = .*
> 
> on the submitter.
> 
> Brian
> 
> On Feb 16, 2014, at 6:53 PM, Troy Robertson <Troy.Robertson@xxxxxxxxxx>
> wrote:
> 
> > Still following up on this problem.
> >
> > Below are a few lines from the submit machine's SchedLog, which I am
> > hoping might be enlightening to someone, because I have no idea.
> > With FULLDEBUG, the line about the queue super user now stands out.
> > I have uninstalled and reinstalled, used new config files, and removed
> > the directory, all to no avail.
> > Job submission works fine as a Personal Pool (Windows 7), and as a
> > Vanilla submit in a wider pool (all Linux).
> > It is only under Condor-C, with the submitter configured as a Personal
> > Pool and performing a grid job submission, that this authentication
> > problem crops up.
> >
> > Why?
> >
> > Troy
> >
> >
> > SCHEDLOG
> > 02/17/14 10:58:16 (pid:5504) Received TCP command 1111
> > (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34702>,
> > access level READ
> > 02/17/14 10:58:16 (pid:5504) Number of Active Workers 0
> > 02/17/14 10:58:16 (pid:5504) QMGR Connection closed
> > 02/17/14 10:58:17 (pid:5504) Received TCP command 1112
> > (QMGMT_WRITE_CMD) from SYSTEM <147.66.85.78:34708>, access level
> > WRITE
> > 02/17/14 10:58:17 (pid:5504) Queue super user not allowed to set
> > owner to troy@mydomain, because this instance of the schedd has never
> > seen that user submit any jobs.
> > 02/17/14 10:58:17 (pid:5504) SetEffectiveOwner security violation:
> > setting owner to troy@mydomain when active owner is "SYSTEM"
> > 02/17/14 10:58:17 (pid:5504) condor_read(): Socket closed when trying
> > to read 5 bytes from <147.66.85.78:34708>
> > 02/17/14 10:58:17 (pid:5504) IO: EOF reading packet header
> > 02/17/14 10:58:17 (pid:5504) QMGR Connection closed
> > 02/17/14 10:58:17 (pid:5504) Received TCP command 1111
> > (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34715>,
> > access level READ
> > 02/17/14 10:58:17 (pid:5504) Number of Active Workers 0
> > 02/17/14 10:58:17 (pid:5504) QMGR Connection closed
> > 02/17/14 10:58:19 (pid:5504) Received TCP command 1111
> > (QMGMT_READ_CMD) from unauthenticated@unmapped <147.66.85.78:34732>,
> > access level READ
> >
> >
> > ----------------------------------------------------------------------
> >> On Tue, 11 Feb 2014 14:40:15 Troy Robertson wrote:
> >>
> >> Hi,
> >>
> >> I'm struggling with HTCondor-C.
> >>
> >> This was originally working on our system but during the 2 years I
> >> have been away something failed and users reverted to using it as a
> >> single pool.
> >> Still running 7.8.8 across a number of dedicated linux processors
> >> with Windows user submit machines.  I don't want to upgrade until I
> >> find the answer to this issue.
> >>
> >> If a grid job is submitted, it sits locally Idle with: "Request has
> >> not been considered by the Matchmaker". The Gridmanager process keeps
> >> starting up, repeatedly failing to set permissions for something, and
> >> then exiting.  SchedLog shows something similar.
> >>
> >> I have Googled my heart out to no avail, and have re-installed on the
> >> Windows submit machine.  What is it about the UIDs/permissions?
> >> As I said, jobs submitted as Vanilla rather than Grid to the same
> >> remote central manager run as per normal.  The Gahp_worker never
> >> fires, so I think it is a problem locally.
> >>
> >> Can anyone please be of assistance?
> >>
> >> Troy
> >>
> >>
> >> GridmanagerLog:
> >> ...
> >> 02/11/14 14:23:40 [7608] TokenCache contents:
> >> troy@domain
> >> 02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent()
> >> 02/11/14 14:23:40 [7608] DaemonCore::IsPidAlive(): OpenProcess
> >> failed
> >> 02/11/14 14:23:40 [7608] DaemonCore: in SendAliveToParent() - ppid
> >> 4740l disappeared!
> >> 02/11/14 14:23:40 [7608] Checking proxies
> >> 02/11/14 14:23:43 [7608] Initialized the following authorization
> >> table:
> >> 02/11/14 14:23:43 [7608] Authorizations yet to be resolved:
> >> 02/11/14 14:23:43 [7608] allow READ:  */* */*
> >> 02/11/14 14:23:43 [7608] allow WRITE:  */* */local@xxxxx
> >> */147.66.85.62
> >> */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow NEGOTIATOR:  */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow ADMINISTRATOR:  */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow OWNER:  */ local@xxxxx */NEW-
> >> 50985.aad.gov.au */147.66.85.62 */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow DAEMON:  */* */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow ADVERTISE_STARTD:  */* */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow ADVERTISE_SCHEDD:  */* */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] allow ADVERTISE_MASTER:  */* */ local@xxxxx
> >> */147.66.85.62 */147.66.85.62
> >> 02/11/14 14:23:43 [7608] Received ADD_JOBS signal
> >> 02/11/14 14:23:43 [7608] in doContactSchedd()
> >> 02/11/14 14:23:43 [7608] TokenCache contents:
> >> troy@domain
> >> 02/11/14 14:23:43 [7608] SetEffectiveOwner(troy@domain) failed with
> >> errno=13: Permission denied.
> >> 02/11/14 14:23:43 [7608] Failed to connect to schedd! Will retry
> >> 02/11/14 14:23:45 [7608] Evaluating staleness of remote job
> >> statuses.
> >> 02/11/14 14:23:48 [7608] in doContactSchedd()
> >> 02/11/14 14:23:48 [7608] TokenCache contents:
> >> troy@domain
> >> ...[SNIP]...
> >> 02/11/14 14:24:23 [7608] SetEffectiveOwner(troy@domain) failed with
> >> errno=13: Permission denied.
> >> 02/11/14 14:24:23 [7608] Failed to connect to schedd! Will retry
> >> 02/11/14 14:24:28 [7608] in doContactSchedd()
> >> 02/11/14 14:24:28 [7608] TokenCache contents:
> >> troy@domain
> >> 02/11/14 14:24:28 [7608] SetEffectiveOwner(troy@domain) failed with
> >> errno=13: Permission denied.
> >> 02/11/14 14:24:28 [7608] Failed to connect to schedd!
> >> 02/11/14 14:24:28 [7608] ERROR "Too many failures connecting to
> >> schedd!" at line 1246 in file
> >> c:\condor\execute\dir_11160\userdir\src\condor_gridmanager\gridmanager.cpp
> >> 02/11/14 14:28:40 init_user_ids: want user 'troy@domain', current is
> >> '(null)@(null)'
> >> 02/11/14 14:28:40 Found credential for user troy@domain'
> >> 02/11/14 14:28:40 LogonUser completed.
> >>
> >> SchedLog:
> >> ...
> >> 02/11/14 13:36:11 (pid:4740) SetEffectiveOwner security violation:
> >> setting owner to troy@domain when active owner is "SYSTEM"
> >> 02/11/14 13:36:12 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:14 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:15 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:16 (pid:4740) SetEffectiveOwner security violation:
> >> setting owner to troy@domain when active owner is "SYSTEM"
> >> 02/11/14 13:36:17 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:18 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:20 (pid:4740) Number of Active Workers 0
> >> 02/11/14 13:36:21 (pid:4740) SetEffectiveOwner security violation:
> >> setting owner to troy@domain when active owner is "SYSTEM"
> >> 02/11/14 13:36:21 (pid:4740) condor_gridmanager (PID 7652, owner
> >> troy) exited with return code 4.
> >> 02/11/14 13:36:21 (pid:4740) Number of Active Workers 0
> >>
> >>
> >> Condor_config.local:
> >> ...
> >> UID_DOMAIN = $(FULL_HOSTNAME)
> >>
> >> #TRUST_UID_DOMAIN = TRUE
> >>
> >> HOSTALLOW_READ = *
> >> HOSTALLOW_WRITE = *
> >>
> >> ##  Daemons
> >> DAEMON_LIST=MASTER SCHEDD COLLECTOR NEGOTIATOR
> >>
> >> ## GRID PARAMS
> >> GRIDMANAGER_LOG = $(LOG)/GridLogs/GridmanagerLog.$(USERNAME)
> >> C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)
> >> C_GAHP_WORKER_THREAD_LOG = $(LOG)/GridLogs/CGAHPWorkerLog.$(USERNAME)
> >>
> >> ## DEBUGGING
> >> GRIDMANAGER_DEBUG          = D_FULLDEBUG
> >> C_GAHP_DEBUG = D_FULLDEBUG
> >> C_GAHP_WORKER_THREAD_DEBUG = D_FULLDEBUG
> >>
> >> ## Security
> >> SEC_DEFAULT_NEGOTIATION = OPTIONAL
> >> SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
> >>
> >>
> >> Submit file:
> >> Universe = grid
> >> Executable = /usr/bin/R
> >> transfer_executable = False
> >> Arguments = --version
> >> Error = Error_$(Cluster).$(Process).txt
> >> Output = Output_$(Cluster).$(Process).txt
> >> Log = Condor_log.txt
> >> should_transfer_files = ALWAYS
> >> when_to_transfer_output = ON_EXIT
> >> grid_resource = condor server1.a.b.c server1.a.b.c
> >> +remote_requirements = Arch == "X86_64" && OpSys == "LINUX"
> >> +remote_universe = vanilla
> >> +remote_shouldtransferfiles = "YES"
> >> +remote_whentotransferoutput = "ON_EXIT"
> >> Queue
> >>
> >
> >
> > ___________________________________________________________________________
> >
> >    Australian Antarctic Division - Commonwealth of Australia
> > IMPORTANT: This transmission is intended for the addressee only. If
> > you are not the intended recipient, you are notified that use or
> > dissemination of this communication is strictly prohibited by
> > Commonwealth law. If you have received this transmission in error,
> > please notify the sender immediately by e-mail or by telephoning
> > +61 3 6232 3209 and DELETE the message.
> >        Visit our web site at http://www.antarctica.gov.au/
> > ___________________________________________________________________________
> >
> > _______________________________________________
> > HTCondor-users mailing list
> > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
> > with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/htcondor-users/
> 
> 