
Re: [HTCondor-users] Question about condor_suspend (small grid)



On 12/18/18 8:49 AM, Daniel Rosso wrote:
Hello guys, I am an electronic engineer and I created a very small grid to test HTCondor.

Everything works fine for running jobs, but I have an issue when I try to suspend jobs on one specific node.

I attach my tests:

$ condor_suspend --addr <192.168.8.25:37801>
-bash: syntax error near unexpected token `newline'


condor_suspend operates on a running job. It puts the job in the suspended state, where it still occupies the slot on the worker node but stops consuming CPU resources. This is different from condor_vacate_job, which removes the job from the node, marks it as idle, and allows it to restart from scratch. condor_hold also removes the job from the worker node, but marks the job as "Held", so it is not allowed to restart until a user releases it.
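
To make the difference between the three commands concrete, here is a rough sketch rather than anything official. The wrapper name act_on_job and its action keywords are made up for illustration; only the condor_* commands themselves are real HTCondor tools, each paired here with the command that undoes it.

```shell
# Hypothetical helper (not part of HTCondor): maps an action name to the
# matching HTCondor command for a single job id such as 124.0.
act_on_job() {
    action="$1"
    job_id="$2"
    case "$action" in
        # Job keeps its slot but stops using CPU.
        suspend)  condor_suspend "$job_id" ;;
        # Resume a suspended job in place.
        continue) condor_continue "$job_id" ;;
        # Job leaves the node and goes back to idle, free to start again.
        vacate)   condor_vacate_job "$job_id" ;;
        # Job leaves the node and stays "Held" until released.
        hold)     condor_hold "$job_id" ;;
        # Let a held job become idle again so it can be matched.
        release)  condor_release "$job_id" ;;
        *)
            echo "unknown action: $action" >&2
            return 2
            ;;
    esac
}
```

For example, "act_on_job suspend 124.0" followed later by "act_on_job continue 124.0" pauses and resumes the job without giving up its slot.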


condor_suspend is usually run on the submit machine, and it just takes the job id of a running job. So, if you want to suspend a job, find the job id with condor_q and run


condor_suspend 124.0


(replacing 124.0 with whatever the job or cluster id is).
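
Put together, the lookup and the suspend might look like the sketch below. It also shows why the command quoted above failed: the error came from bash, not from HTCondor, because unquoted < and > are redirection operators, so an address string has to be quoted. The variable names are only illustrative, 124.0 is still a placeholder, and the address is printed purely to demonstrate the quoting.

```shell
# Placeholder job id from the example above; replace it with one from condor_q.
JOB_ID="124.0"

# Single quotes stop bash from reading < and > as redirections.
ADDR='<192.168.8.25:37801>'
printf 'address string: %s\n' "$ADDR"

if command -v condor_q >/dev/null 2>&1; then
    # Show only the jobs that are currently running.
    condor_q -run

    # Suspend the chosen job; it stays on its slot but stops using CPU.
    condor_suspend "$JOB_ID"
else
    echo "HTCondor tools not found; run this on the submit machine" >&2
fi
```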


-greg