
Re: [Condor-users] possible regression with condor_ssh_to_job





On 7/30/11 7:44 AM, Rita wrote:
Can someone please try out these use cases? 

* I have a policy that a job exceeding its request_memory should be put on hold. I submit a job with request_memory = 1024 and then run condor_ssh_to_job <jid>. In the ssh session I start another program that consumes more than 1024 megabytes of memory and leave the program running. Nothing gets held.

Are you using periodic_hold?  For running jobs, that is evaluated in the condor_shadow process.  The update of ImageSize from the starter to the shadow happens every STARTER_UPDATE_INTERVAL seconds (default 300).  Did you wait that long?
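
For reference, here is a minimal sketch of the kind of setup being discussed. It is not taken from Rita's actual configuration: the file name memhold.sub, the executable, and the exact periodic_hold expression are assumptions for illustration. The expression assumes the job ad carries RequestMemory (in megabytes) and relies on ImageSize being reported in kilobytes, hence the conversion.

```shell
# Write a hypothetical submit description file (name and contents are
# illustrative, not from the original report).
cat > memhold.sub <<'EOF'
universe = vanilla
executable = /bin/sleep
arguments = 600
request_memory = 1024
# ImageSize is in kilobytes, RequestMemory in megabytes: convert before comparing.
periodic_hold = ImageSize > RequestMemory * 1024
queue
EOF

# On a machine with Condor installed, one would then run something like
# the following (left commented out so the sketch has no side effects):
#   condor_config_val STARTER_UPDATE_INTERVAL   # starter-to-shadow update period
#   condor_submit memhold.sub
#   condor_q -hold                              # check after at least one update interval
```

Because the shadow only sees a new ImageSize when the starter sends an update, a hold triggered this way can lag the actual memory growth by up to one update interval.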


* Submit a job that takes 60 seconds (sleep 60), run condor_ssh_to_job <jid>, and keep the session occupied. Even after the `sleep 60` has completed, I am still logged into the session. I should get kicked out.

This is expected behavior.  condor_ssh_to_job processes are treated like job processes.  The condor_starter does not shut down as long as any of these processes continue running.  This allows the user to debug things in the immediate aftermath of the job.  All the usual mechanisms for evicting jobs apply to ssh sessions as well: PREEMPT, RANK, PREEMPTION_REQUIREMENTS, condor_vacate_job, condor_vacate, condor_rm.  In addition, if there is a concern about users forgetting they have an ssh session open, most shells provide an auto-logout feature that can disconnect after X amount of idle time.
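
As a sketch of the auto-logout idea, assuming the ssh session starts bash (or ksh) and that the setting is placed in the user's shell startup file or typed at the prompt; the 600-second value is an arbitrary example:

```shell
# bash/ksh: an interactive shell that sits idle at its prompt for more
# than TMOUT seconds exits on its own.
TMOUT=600
export TMOUT

# tcsh has a similar feature, with the value given in minutes:
#   set autologout = 10
```

Note that this timeout only applies while the shell is waiting at its prompt. A program left running in the foreground of the session is not affected, so the eviction mechanisms listed above are still what ends such a session.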

--Dan




Any thoughts? 



--
--- Get your facts first, then you can distort them as you please.--

