
Re: [HTCondor-users] condor_suspend/continue vs. condor_hold/release



On 5/27/2013 4:53 PM, 钱晓明 wrote:
Hi, What is the difference between condor_suspend/continue and
condor_hold/release for vanilla jobs?
I see this for condor_hold in the manual: "A currently running job that is
placed in the hold state by condor_hold is sent a hard kill signal." So
I think that this job will be killed and put in the HOLD state. What
does condor_suspend do to a running job?



condor_hold kills the process(es) currently associated with the job and frees the machine slot that was running it, so the slot can run another job.

condor_suspend does NOT kill the process(es) associated with the job, but instead tells the operating system not to schedule any CPU time for the job (in Unix-land, this means sending a SIGSTOP signal to the job). The job still keeps the machine slot occupied - it is still consuming RAM (or at least swap), kernel resources like file descriptors, and disk. But it does not consume any CPU cycles until you run condor_continue.
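
To make the difference concrete at the operating-system level, here is a minimal Python sketch (Unix only). It is not HTCondor code and does not show how HTCondor does this internally; it only illustrates the signals involved. The function name suspend_then_kill and the sleeping child process are made up for the illustration. SIGSTOP freezes the child but leaves the process in place, SIGCONT lets it run again, and a hard kill (SIGKILL) removes it entirely.

```python
import os
import signal
import subprocess
import sys


def suspend_then_kill():
    """Stop, continue, then hard-kill a child process (illustration only)."""
    # Hypothetical stand-in for a job: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        # "Suspend": the process stays in the process table, keeping its
        # memory and open files, but gets no CPU time.
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        # "Continue": the same process resumes where it left off.
        os.kill(child.pid, signal.SIGCONT)
        alive_after_continue = child.poll() is None

        # Hard kill: the process is gone for good.
        child.kill()
        exit_code_after_kill = child.wait()
    finally:
        # Make sure no child is left behind if something above failed.
        if child.poll() is None:
            child.kill()
            child.wait()
    return {
        "stopped": stopped,
        "alive_after_continue": alive_after_continue,
        "exit_code_after_kill": exit_code_after_kill,
    }


if __name__ == "__main__":
    print(suspend_then_kill())
```

After the stop and continue steps the child is still the same process with the same PID. After the kill, only its exit status is left, which Python reports as the negative signal number.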

Weak analogy to playing a DVD of a movie: Think of condor_suspend like freeze-framing the playback of a movie DVD, while condor_hold is like ejecting the DVD and putting it back on a shelf to watch another day.

regards
Todd