
Re: [HTCondor-users] condor_ssh_to_job



Hi Todd and thanks for the help on this...

Indeed, as we suspected, the job is running as nobody.

On my processing node in condor_config I have the line:
STARTER_ALLOW_RUNAS_OWNER = TRUE

In my job submission I have the line:
RunAsOwner      = True

I was thinking that with those settings, and given that the user exists on both the submit node and the processing node, the job would run as dcshrum instead of nobody, and I would not need to have a shell enabled for nobody.  If I do give nobody a shell, then everything works.

--Donny

 [dcshrum@condor-login vanilla]$ condor_ssh_to_job 5
Welcome to slot1@xxxxxxxxxxxxxxx!
Your condor job is running with pid(s) 23397.
bash-4.1$ pwd
/var/lib/condor/execute/dir_23394


-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: Wednesday, August 21, 2013 3:52 PM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] condor_ssh_to_job

On 8/21/2013 2:20 PM, Shrum, Donald C wrote:
> Thanks for the replies....
>
> Our condor cluster is a dedicated cluster and I am the sysadmin and perfectly happy for users to ssh_to_job to see what is going on.
> User home directories are not on the processing nodes but the logins are enabled and shells present.
>
> [dcshrum@condor-login vanilla]$ condor_q
> -- Submitter: condor-login.local : <10.178.6.3:43563> : condor-login.local
>   ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
>     2.0   dcshrum         8/21 15:12   0+00:00:01 R  0   0.0  a.out
>
>   [dcshrum@condor-login vanilla]$ condor_ssh_to_job 2
> This account is currently not available.
> Connection to condor-job.condor-29.local closed.
>
>
> [dcshrum@condor-login vanilla]$ ssh dcshrum@condor-29
> Warning: Permanently added 'condor-29,10.178.6.29' (RSA) to the list of known hosts.
> dcshrum@condor-29's password:
> -sh-4.1$ logout
> Connection to condor-29 closed.
>
> There must be a setting I am missing.

See my previous post...
So did you configure HTCondor to run jobs as user nobody on the execute machines?  If so, ssh to an execute machine (condor-29) and do
    grep ^nobody /etc/passwd
Does this account exist on your execute machine? If not, there is your problem.  If it does exist, what shell (last field) is specified? Is it something like /bin/nologin?  If so that is your problem - it needs to be a shell listed in /etc/shells.
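
The check described above can be sketched in Python. This is only an illustration, not part of HTCondor: the function names (login_shell, allowed_shells, diagnose) and the result strings are made up for this example, and it assumes the account is defined in the local /etc/passwd rather than in a directory service such as LDAP or NIS.

```python
from typing import Iterable, Optional, Set


def login_shell(passwd_lines: Iterable[str], user: str) -> Optional[str]:
    """Return the shell (last of the 7 passwd fields) for `user`, or None if absent."""
    for line in passwd_lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) >= 7 and fields[0] == user:
            return fields[6]
    return None


def allowed_shells(shells_lines: Iterable[str]) -> Set[str]:
    """Collect the entries of /etc/shells, skipping blank lines and comments."""
    shells = set()
    for line in shells_lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            shells.add(entry)
    return shells


def diagnose(passwd_lines: Iterable[str],
             shells_lines: Iterable[str],
             user: str = "nobody") -> str:
    """Classify the account using hypothetical labels: missing, bad-shell, or ok."""
    shell = login_shell(passwd_lines, user)
    if shell is None:
        return "missing"      # no such account on this machine
    if shell not in allowed_shells(shells_lines):
        return "bad-shell"    # e.g. a nologin shell that is not in /etc/shells
    return "ok"
```

On an execute machine this could be called as diagnose(open("/etc/passwd"), open("/etc/shells")); a result other than "ok" for the account the job runs as would match the "This account is currently not available." symptom.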

p.s. Note that instead of user "nobody" you may want to define "slot users"
so that jobs running on different slots use different UIDs - see http://research.cs.wisc.edu/htcondor/manual/v8.0/3_6Security.html#sec:RunAsNobody

-Todd
