
Re: [Condor-users] error using checkpointing



Daniel Forrest wrote:
Roberto Nunnari wrote:

I'm new to Condor and to checkpointing, but we have a small cluster
here, and I'd like to introduce checkpointing.

As our queueing system we use SGE, and at present we don't plan to
change that.

So, I'm testing Condor checkpointing, but whatever I do, I always get
errors, and the .ckpt file never gets created, only the .ckpt.tmp file.

To test it, I use a simple program that prints a counter and then
calls nanosleep() to sleep for 1 second.


First, I get warnings during compilation:

Don't worry about those, they are normal.

Then, here's the run session, interrupted with SIGTSTP:

$ ./blah3 -_condor_D_ALL

You are being hit by address space randomization.

Sigh, this should really be spelled out BOLDLY in the Condor manual.

When you are testing standalone checkpointing, you need to disable
address space randomization.  Like this:

$ setarch x86_64 -R -L ./blah3 -_condor_D_ALL

You don't really need "-L" for x86_64, but you do for i386, so I
always mention both "-R" and "-L" in case a 32-bit user stumbles
across this in the archives.
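
For anyone who wants to check this on their own machine, here is a
minimal sketch, assuming a Linux system where util-linux's setarch is
installed and /proc is mounted. The helper names aslr_mode and
run_without_aslr are made up for illustration and are not part of
Condor. The first reports the kernel-wide randomization setting, and
the second wraps the setarch command above so the architecture is taken
from "uname -m" instead of being typed by hand.

```shell
#!/bin/sh
# Sketch only: aslr_mode and run_without_aslr are hypothetical helper names.

# Print the kernel's address space randomization setting:
#   0 = disabled, 1 = partial randomization, 2 = full randomization.
aslr_mode() {
    cat /proc/sys/kernel/randomize_va_space
}

# Run one command with randomization disabled for that process only.
# -R turns off address space randomization; -L selects the legacy
# memory layout, which the thread says is needed on i386.
run_without_aslr() {
    setarch "$(uname -m)" -R -L "$@"
}

# Usage (blah3 is the test binary from this thread):
#   aslr_mode
#   run_without_aslr ./blah3 -_condor_D_ALL
```

If aslr_mode prints 0, randomization is already off system-wide and the
wrapper changes nothing. Some sandboxed or containerized environments
may refuse the setarch call, so treat a failure there as an environment
restriction rather than a Condor problem.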

Yes! That works very well! Thank you very much Dan.
Best regards.
Robi