
Re: [Condor-users] 7.0.3 Debian 4 dynamic clipped?



Hi Daniel,

> On Wed, Jul 23, 2008 at 11:55:11PM +0100, Richard Palmer wrote:
> 
> > Remote syscalls seem to work (well, a simple 'fread' does anyway).
> > 
> > Checkpoint claimed to work, but when I try standalone checkpointing with
> > a simple sleep() programme it fails with: 
> > 
> > condor-vm-3:/home/condor$ ./simple.std  -_condor_D_ALL -_condor_restart simple.std.ckpt
> 
> You really want to run it like this:
> 
> setarch x86_64 -L -R ./simple.std -_condor_D_ALL
> 
> And then again to restart:
> 
> setarch x86_64 -L -R ./simple.std -_condor_D_ALL -_condor_restart simple.std.ckpt
> 
> Otherwise the address layout changes will kill you.

Ah, good point. I now get up to:

condor-vm-3:/home/condor$ ./simple.std -_condor_D_ALL -_condor_restart simple.std.ckpt
User Job - $CondorPlatform: X86_64-LINUX_DEBIAN40 $
User Job - $CondorVersion: 7.1.1 Jul 23 2008 $
Condor: Notice: Will restart from simple.std.ckpt
Read headers OK
Read SegMap[0](DATA) OK
Read SegMap[1](STACK) OK
Read all SegMaps OK
Restoring a DATA segment
Found a DATA block, increasing heap from 0x6b2000 to 0x6b3000
About to overwrite 696320 bytes starting at 0x609000(DATA)
About to execute on TmpStk
About to execute on tmpstack.
Beginning Execution on TmpStack.
RestoreStack() Entrance!
Restoring a STACK segment
About to overwrite 40959 bytes starting at 0x7fff1fed7000(STACK)
in Segmap::Read(): fd = 3, read_size=40959
Segmentation fault

Stack trace gives:

Core was generated by `./simple.std -_condor_D_ALL -_condor_restart simple.std.ckpt'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004633b6 in getenv ()
(gdb) bt
#0  0x00000000004633b6 in getenv ()
#1  0x000000000045f14a in __dcigettext ()
#2  0x000000000047f9b8 in strerror_r ()
#3  0x000000000047f81e in strerror ()
#4  0x0000000000403ec5 in SegMap::Read ()
#5  0x000000000040498d in Image::RestoreSeg ()
#6  0x0000000000404a4f in RestoreStack ()
#7  0x0000000000405d94 in ExecuteOnTmpStk ()
#8  0x0000000000404b8e in Image::Restore ()
#9  0x0000000000404bf8 in restart ()
#10 0x0000000000400ca0 in MAIN ()
#11 0x000000000045dd08 in __libc_start_main ()
#12 0x00000000004001ba in _start () at ../sysdeps/x86_64/elf/start.S:113

setarch doesn't seem to exist on Debian. What does it actually do?
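
For reference, setarch is a small util-linux wrapper around the Linux
personality(2) call. Its -R option corresponds to the ADDR_NO_RANDOMIZE
flag (no address space randomization) and -L to ADDR_COMPAT_LAYOUT (legacy
memory layout). It sets those flags and then execs the given command, which
inherits them. The sketch below illustrates that idea in Python. It is not
setarch's actual source: the helper names persona_flags and
run_with_fixed_layout are made up for illustration, and it assumes Linux
with glibc.

```python
import ctypes
import os

# Flag values as defined in <linux/personality.h>.
ADDR_NO_RANDOMIZE = 0x0040000   # what "setarch -R" requests
ADDR_COMPAT_LAYOUT = 0x0200000  # what "setarch -L" requests


def persona_flags(no_randomize=True, compat_layout=True):
    """Build the personality bits for the requested options (hypothetical helper)."""
    flags = 0
    if no_randomize:
        flags |= ADDR_NO_RANDOMIZE
    if compat_layout:
        flags |= ADDR_COMPAT_LAYOUT
    return flags


def run_with_fixed_layout(argv):
    """Set the flags on this process, then replace it with argv (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int

    # Passing 0xffffffff queries the current persona without changing it.
    current = libc.personality(0xFFFFFFFF)
    if current == -1:
        raise OSError(ctypes.get_errno(), "personality query failed")

    if libc.personality(current | persona_flags()) == -1:
        raise OSError(ctypes.get_errno(), "personality update failed")

    # The persona survives exec, so the new program starts with a stable layout.
    os.execvp(argv[0], argv)


# Example (not executed here): roughly "setarch x86_64 -L -R ./simple.std ..."
# run_with_fixed_layout(["./simple.std", "-_condor_D_ALL"])
```

Both the original run and the restart would need to go through the same
wrapper, so that the addresses recorded in the checkpoint match the layout
of the restarted process.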

regards,

Richard.

-- 
Richard Palmer
Systems Administration Officer / Centre for E-Research
King's College London          / Centre for Computing in the Humanities
Tel: 0207 848 1973