[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Condor-users] condor_ssh_to_job



So, if a user submits a job with request_memory=1000 and I have a policy where the job gets held if it goes over 1000 megabytes: my process is at 900 megabytes, I ssh in and run something careless that uses another 100 megabytes, and the job gets held? Is that the correct behavior?



On Wed, Jul 27, 2011 at 11:33 AM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:

For better or worse, all processes run by the user on the execute node are run and monitored as though they were part of the job.  Therefore, policies relating to cpu affinity, cpu usage, memory usage, and so on should all be applied to the ssh session, just like any other processes run by the user in a job.
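As a concrete illustration of the kind of policy being discussed here, a memory-based hold is typically expressed in the startd/system configuration. This is only a sketch, not taken from the thread; the knob and attribute names (SYSTEM_PERIODIC_HOLD, MemoryUsage, RequestMemory) are the standard Condor ones, but verify them against your Condor version:

```
# condor_config sketch (untested): hold any job whose measured memory
# usage exceeds what it requested. Because processes started from a
# condor_ssh_to_job session are monitored as part of the job, memory
# they use counts toward MemoryUsage here too.
SYSTEM_PERIODIC_HOLD = (MemoryUsage > RequestMemory)
SYSTEM_PERIODIC_HOLD_REASON = "Job exceeded request_memory"
```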

--Dan


On 7/27/11 5:36 AM, Rita wrote:
Can people take advantage of condor_ssh_to_job?

Can't they log in to the box and run something else that will take additional resources? Or is there a mechanism that will prevent that?


On Tue, Jul 19, 2011 at 4:47 PM, Sassy Natan <sassyn@xxxxxxxxx> wrote:
Hi Dan,

Well, I tried to play a little with the screen command, and found there are many issues while using it.
I guess it is not a good solution, as you pointed out.

I still think, however, that it is an important feature to have control over your running process. Not only killing, holding, etc., but also the ability to steal it back to your own console.

I know I can get all the files and logs the process created, and I know I can also sniff them on the execute machine. I do not know, however, how I can interact with the running process controlled by Condor.

I will send a new email asking whether it might be possible to change the directory where the process runs.

So when Condor runs the job on an execute machine, the working directory would not be created in the /var/lib/condor/execute directory (the default in the Red Hat RPMs) but in, say, the user's home directory?

For Example:
If user foo.bar submits a job, the working directory of the job would be located on an NFS shared volume accessible to all execute machines, instead of on the limited local disk at /var/lib/condor.
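For what it's worth, the location of the execute directory is a per-node configuration setting, so a sketch of what I'm asking for might look like this (the EXECUTE knob is the standard one, but the path is hypothetical, and running execute directories on NFS has its own caveats):

```
# condor_config on each execute node (sketch, untested): point the
# scratch/execute area at a shared NFS mount instead of the RPM
# default /var/lib/condor/execute.
EXECUTE = /nfs/condor_execute
```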


Thanks
Sassy   

On Tue, Jul 19, 2011 at 1:01 AM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Sassy,

condor_ssh_to_job gives you an interactive shell on the same machine and in the same environment as the running job.  This allows you to inspect the process in many ways, but it does not attach the i/o streams of the job to your terminal as though you had run the job by hand.  The i/o streams of the job are directed to files or streams and cannot be easily redirected to something else unless you go through some extra effort when running the job.  I haven't tried it myself, but I imagine it would be possible to use the unix 'screen' utility to make this possible.  However, I would recommend getting more familiar with standard batch-job debugging techniques before trying something exotic like running every job under screen.  Being able to type commands into running jobs is a nifty thought, but batch jobs should be designed to run without interactive input, so it doesn't sound that useful in practice to me.
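For anyone who wants to experiment with the screen idea anyway, a minimal wrapper sketch might look like the following. This is untested, exactly as cautioned above; the wrapper name, session name, and payload are made up for illustration:

```
#!/bin/sh
# run_under_screen.sh (hypothetical wrapper submitted as the job's
# executable): start the real payload inside a detached screen
# session so a later condor_ssh_to_job shell can attach to it.
screen -D -m -S job_console ./real_payload "$@"
# From a condor_ssh_to_job session, one would then attach with:
#   screen -r job_console
```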

--Dan


On 7/18/11 4:44 PM, Sassy Natan wrote:
While googling the different results for condor_ssh_to_job, I found some interesting examples on this page https://twiki.grid.iu.edu/bin/view/Engagement/HtmlVersion (see 9 Appendix: Monitoring a running job). The example shows two interesting commands: glidein_ls and glidein_interactive. This is very cool, but as far as I can tell from a quick reading it is part of the glideinWMS
project. Is there anything like this in Condor? I guess I could look at the command files (which are Python based) to understand how this works in glideinWMS and maybe try to convert them. But if someone has different ideas, please be my guest :-)

I have the feeling Condor already has this, I just don't know how yet :-)

Sassy

 

On Mon, Jul 18, 2011 at 8:10 PM, Sassy Natan <sassyn@xxxxxxxxx> wrote:
Hi,

I'm running condor on Linux, with total of 200 slots in my pool.

When running a job, my users would like from time to time to interact with the running job.
So if, for example, they look at the job's output file (stdout) and see some error, they would like to ssh into the job and make some changes to the future input files (in the execute dir).
I managed to ssh into the job, and even get a welcome screen that points me to the slot the job is running on.
I am also getting the PID of the process, but I don't know how to attach to the process.

If my process in job.sub is a Perl script, taking different args and also calling different tools (like MATLAB, gcc, etc.), how can I get into a mode that looks as if I ran the command from my console? Where I can see the stdout tail on screen, and press CTRL+C to terminate the job, the same as when using a non-Condor environment?
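For reference, the basic flow being described is roughly the following (the job id 1234.0 and the output filename are examples, not taken from the thread):

```
# From the submit machine, open a shell inside the job's sandbox:
condor_ssh_to_job 1234.0
# Inside that session you land in the job's execute directory, e.g.
# under /var/lib/condor/execute, and can inspect its processes:
ls
ps -u "$(whoami)" -f
tail -f job.out    # if the job's stdout file lives in the sandbox
```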

The thing is that if one of the tools hits an error, it drops into its own shell, as for example in MATLAB, where I could provide or change some parameters and resume the run. However, under Condor it just drops into the shell and I cannot attach to it. The job is running from Condor's perspective, but as a matter of fact it is just idle, waiting for some input on the shell (in my case MATLAB, but there are some other tools as well).

I tried to use gdb, but that seems to have stalled my job. The minute I attached, the job log file stopped updating. Until then it had printed a lot of info (I use the stream option), but once I used gdb there was no more activity on the running machine.
I know the job is getting into a shell mode, since there are some errors. If there is no error the job completes successfully, but my users would really like to debug the job when it gets into this mode, rather than having to rerun it from the beginning or outside Condor.
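One note on the gdb behavior: attaching gdb suspends the target process until you detach, which would explain the log going quiet. A cautious inspect-and-resume session (PID is a placeholder) looks like:

```
gdb -p <PID>        # attaching suspends the process
# (gdb) bt          # see where it is blocked (e.g. reading from a tty)
# (gdb) detach      # let the process continue running
# (gdb) quit
```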

Can someone please provide an example? or a feedback?


Thanks
Sassy




_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/







--
--- Get your facts first, then you can distort them as you please.--




