Re: [Condor-users] possible regression with condor_ssh_to_job

On 7/30/11 7:44 AM, Rita wrote:
Can someone please try out these use cases? 

* I have a policy that a job should be put on hold if it goes over its request_memory. I submit a job with request_memory = 1024 and then run condor_ssh_to_job <jid>. In the ssh session I start another program that consumes more than 1024 megabytes of memory and leave the program running. Nothing gets held.

Are you using periodic_hold?  For running jobs, that is evaluated in the condor_shadow process.  The update of ImageSize from the starter to the shadow happens every STARTER_UPDATE_INTERVAL seconds (default 300).  Did you wait that long?
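
For reference, here is a minimal sketch of what such a policy could look like in the submit description file, assuming it is implemented with periodic_hold and ImageSize. The executable name is hypothetical. ImageSize is reported in KiB while request_memory is given in MiB, so the threshold is scaled by 1024.

```
# Sketch only: my_job.sh is a hypothetical executable name.
executable     = my_job.sh
request_memory = 1024

# Hold the job once its reported ImageSize (KiB) exceeds 1024 MiB.
# The shadow evaluates this against the last value it received from
# the starter, so the hold can lag by up to STARTER_UPDATE_INTERVAL seconds.
periodic_hold  = ImageSize > 1024 * 1024

queue
```

With the default interval, a test like the one described would need to leave the memory-hungry program running for at least five minutes before concluding that the hold never fires.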

* Submit a job that takes 60 seconds (sleep 60), run condor_ssh_to_job <jid>, and keep the session occupied. Even after `sleep 60` has completed, I am still logged into the session. I should get kicked out.

This is expected behavior.  condor_ssh_to_job processes are treated like job processes.  The condor_starter does not shut down as long as any of these processes continue running.  This allows the user to debug things in the immediate aftermath of the job.  All the usual mechanisms for evicting jobs apply to ssh sessions as well: PREEMPT, RANK, PREEMPTION_REQUIREMENTS, condor_vacate_job, condor_vacate, condor_rm.  In addition, if there is a concern about users forgetting they have an ssh session open, most shells provide an auto-logout feature that can disconnect after X amount of idle time.
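
As an illustration of the auto-logout suggestion, here is a sketch for bash users. It assumes the ssh session starts an interactive bash shell that picks up this setting (for example from ~/.bashrc), which I have not verified for condor_ssh_to_job. The 30-minute value is an arbitrary example.

```
# Hypothetical ~/.bashrc fragment: terminate an interactive bash
# session after 1800 seconds (30 minutes) without input at the prompt.
TMOUT=1800
export TMOUT
```

tcsh has a similar variable, autologout, whose value is given in minutes.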


Any thoughts? 

