Re: [HTCondor-users] condor_ssh_to_job
- Date: Wed, 21 Aug 2013 20:42:59 +0000
- From: "Shrum, Donald C" <DCShrum@xxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] condor_ssh_to_job
Hi Todd and thanks for the help on this...
Indeed, as we suspected, the job is running as nobody.
On my processing node in condor_config I have the line:
STARTER_ALLOW_RUNAS_OWNER = TRUE
In my job submission I have the line:
RunAsOwner = True
I was thinking that with those settings (and of course the user exists on both the submit node and the processing node), the job would run as dcshrum instead of nobody, and I would not need to have a shell enabled for nobody. If I do include a shell for nobody, then everything works.
[dcshrum@condor-login vanilla]$ condor_ssh_to_job 5
Welcome to slot1@xxxxxxxxxxxxxxx!
Your condor job is running with pid(s) 23397.
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: Wednesday, August 21, 2013 3:52 PM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] condor_ssh_to_job
On 8/21/2013 2:20 PM, Shrum, Donald C wrote:
> Thanks for the replies....
> Our condor cluster is a dedicated cluster and I am the sysadmin and perfectly happy for users to ssh_to_job to see what is going on.
> User home directories are not on the processing nodes but the logins are enabled and shells present.
> [dcshrum@condor-login vanilla]$ condor_q
> -- Submitter: condor-login.local : <10.178.6.3:43563> : condor-login.local
> ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
> 2.0 dcshrum 8/21 15:12 0+00:00:01 R 0 0.0 a.out
> [dcshrum@condor-login vanilla]$ condor_ssh_to_job 2
> This account is currently not available.
> Connection to condor-job.condor-29.local closed.
> [dcshrum@condor-login vanilla]$ ssh dcshrum@condor-29
> Warning: Permanently added 'condor-29,10.178.6.29' (RSA) to the list of known hosts.
> dcshrum@condor-29's password:
> -sh-4.1$ logout
> Connection to condor-29 closed.
> There must be a setting I am missing.
See my previous post...
So did you configure HTCondor to run jobs as user nobody on the execute machines? If so, ssh to an execute machine (condor-29) and do
grep ^nobody /etc/passwd
Does this account exist on your execute machine? If not, there is your problem. If it does exist, what shell (last field) is specified? Is it something like /bin/nologin? If so that is your problem - it needs to be a shell listed in /etc/shells.
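If you would rather script that check than eyeball it, here is a rough Python sketch of the same test. It is not part of HTCondor. The function names (login_shell, allowed_shells, diagnose) are made up for illustration, and it works on the text of /etc/passwd and /etc/shells passed in as strings. One extra assumption beyond the check above: some distributions list nologin in /etc/shells, so the sketch also treats any shell named nologin or false as unusable.

```python
import os


def login_shell(passwd_text, user):
    """Return the login shell (last field) of `user`, or None if absent."""
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        # A passwd entry has 7 colon-separated fields; the shell is the last.
        if len(fields) == 7 and fields[0] == user:
            return fields[6]
    return None


def allowed_shells(shells_text):
    """Return the set of shells listed in /etc/shells-style text."""
    shells = set()
    for line in shells_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # skip blanks and comments
            shells.add(line)
    return shells


def diagnose(passwd_text, shells_text, user="nobody"):
    """Classify the account as 'missing', 'bad-shell', or 'ok'."""
    shell = login_shell(passwd_text, user)
    if shell is None:
        return "missing"
    # Assumption (not from the original advice): nologin/false never work
    # as an interactive shell, even if they happen to be in /etc/shells.
    if os.path.basename(shell) in ("nologin", "false"):
        return "bad-shell"
    if shell not in allowed_shells(shells_text):
        return "bad-shell"
    return "ok"
```

On the execute machine you would call it with the real files, for example diagnose(open("/etc/passwd").read(), open("/etc/shells").read()). Anything other than "ok" for the account the job runs as would be consistent with the "This account is currently not available" message.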
p.s. Note instead of user "nobody" you may want to define "slot users"
so jobs running on different slots use different UIDs - see http://research.cs.wisc.edu/htcondor/manual/v8.0/3_6Security.html#sec:RunAsNobody