Re: [HTCondor-users] condor_ssh_to_job
- Date: Wed, 21 Aug 2013 14:51:35 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] condor_ssh_to_job
On 8/21/2013 2:20 PM, Shrum, Donald C wrote:
Thanks for the replies....
Our condor cluster is a dedicated cluster and I am the sysadmin and perfectly happy for users to ssh_to_job to see what is going on.
User home directories are not on the processing nodes but the logins are enabled and shells present.
[dcshrum@condor-login vanilla]$ condor_q
-- Submitter: condor-login.local : <10.178.6.3:43563> : condor-login.local
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
2.0 dcshrum 8/21 15:12 0+00:00:01 R 0 0.0 a.out
[dcshrum@condor-login vanilla]$ condor_ssh_to_job 2
This account is currently not available.
Connection to condor-job.condor-29.local closed.
[dcshrum@condor-login vanilla]$ ssh dcshrum@condor-29
Warning: Permanently added 'condor-29,10.178.6.29' (RSA) to the list of known hosts.
Connection to condor-29 closed.
There must be a setting I am missing.
See my previous post...
So did you configure HTCondor to run jobs as user nobody on the execute
machines? If so, ssh to an execute machine (condor-29) and do
grep ^nobody /etc/passwd
Does this account exist on your execute machine? If not, there is your
problem. If it does exist, what shell (last field) is specified? Is it
something like /bin/nologin? If so, that is your problem: the account's
login shell needs to be one of the shells listed in /etc/shells.
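
As a rough sketch of that check (not an HTCondor tool), the two steps can be combined into one small shell function. The name check_login_shell and its optional file arguments are made up for illustration; by default it reads /etc/passwd and /etc/shells on whatever machine you run it on.

```shell
# Hypothetical helper (not part of HTCondor): report whether a user's
# login shell is listed in the shells file.
# Usage: check_login_shell USER [PASSWD_FILE] [SHELLS_FILE]
check_login_shell() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    shells_file=${3:-/etc/shells}

    # Find the account's passwd entry; fail if the account does not exist.
    entry=$(grep "^${user}:" "$passwd_file") || {
        echo "no account named $user in $passwd_file"
        return 1
    }

    # The login shell is the seventh (last) colon-separated field.
    shell=$(printf '%s\n' "$entry" | head -n 1 | cut -d: -f7)

    # An empty shell field conventionally means /bin/sh.
    [ -n "$shell" ] || shell=/bin/sh

    # -x matches the whole line, -F treats the path as a literal string.
    if grep -qxF -- "$shell" "$shells_file"; then
        echo "$user has login shell $shell (listed in $shells_file)"
        return 0
    fi
    echo "$user has login shell $shell (NOT listed in $shells_file)"
    return 2
}
```

After defining the function on the execute machine, run "check_login_shell nobody" (or pass whichever account your jobs actually run as). It only reads local files, so accounts that come from NIS or LDAP would need a different lookup.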
p.s. Note that instead of user "nobody" you may want to define "slot users"
so jobs running on different slots use different UIDs - see