
Re: [Condor-users] A Problem while restarting a checkpoint file



Thank you, Mark and Daniel, for your replies. As Daniel noted, I am submitting the jobs to the vanilla
universe and handling the checkpointing myself. Yes, I don't see any apparent reason either why this should cause the problem.

-- Tan

On Fri, Mar 28, 2008 at 7:10 AM, Daniel Forrest <forrest@xxxxxxxxxxxxx> wrote:
Mark,

> From
> http://www.cs.wisc.edu/condor/manual/v6.8/1_4Current_Limitations.html,
> see point 4 of the section "Limitations on Jobs which can Checkpointed":
>
> 4. Sending or receiving the SIGUSR2 or SIGTSTP signals is not allowed.
> Condor reserves these signals for its own use. Sending or receiving all
> other signals /is/ allowed.

Those limitations apply to jobs run in the Standard Universe.

What Tan is doing is running a condor_compile'd binary as a Vanilla
Universe job and using the standalone features of checkpointing.

There is no reason why this shouldn't work.

--
Daniel K. Forrest       Laboratory for Molecular and
forrest@xxxxxxxxxxxxx   Computational Genomics
(608) 262 - 9479        University of Wisconsin, Madison



--
Tanzima Zerin Islam
Graduate Student
School of Electrical & Computer Engineering
Purdue University