
Re: [Condor-users] Problem with Condor standalone library



Daniel,
Thanks for your prompt reply. The debug output is below. It looks like the checkpoint fails when closing the checkpoint file (fd 3), after all segments have been written.

$ setarch i386 -R ./helloWorld -_condor_D_ALL
User Job - $CondorPlatform: X86_64-LINUX_RHEL3 $
User Job - $CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
Condor: Notice: Will checkpoint to ./helloWorld.ckpt
Condor: Notice: Remote system calls disabled.
1
2
3
4
5
6
7
8
9
10
Got SIGTSTP
Saved signal state.
About to save file state
CondorFileTable::checkpoint

OPEN FILE TABLE:
fd 0
        logical name: default stdin
        offset:       0
        dups:         1
        open flags:   0x0
        not currently bound to a url.
fd 1
        logical name: default stdout
        offset:       84
        dups:         1
        open flags:   0x1
        url:          fd:1
        size:         84
        opens:        1
fd 2
        logical name: default stderr
        offset:       0
        dups:         1
        open flags:   0x1
        not currently bound to a url.
working dir = /autohome/u102/tislam/helloWorld
Done saving file state
About to update MyImage
Adding a DATA segment: start[0xlx], end [0xlx]
Image::AddSegment: name=[DATA], start=[653000], end=[70b000], length=[0xlx], prot=[0xb8000]
Adding a STACK segment: start[0xlx], end [0xlx]
Image::AddSegment: name=[STACK], start=[7fbfff6000], end=[7fbfffffff], length=[0xlx], prot=[0x9fff]
Pos: 754720
Pos: 795679
Size of ckpt image = 795679 bytes
About to write checkpoint
Image::Write(): fd -1 file_name ./helloWorld.ckpt
Checkpoint name is "./helloWorld.ckpt"
Tmp name is "./helloWorld.ckpt.tmp"
Wrote headers OK
Wrote all SegMaps OK
write(fd=3,core_loc=0xlx,len=0xlx)
I wrote 753664 bytes with write...
Wrote Segment[0] of type DATA -> OK
write(fd=3,core_loc=0xlx,len=0xlx)
I wrote 40959 bytes with write...
Wrote Segment[1] of type STACK -> OK
Wrote all Segments OK
About to close ckpt fd (3)
Close failed!
Ckpt exit
Write failed with [-1]
Killed
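
For readers following the log: the sequence it shows (write to "./helloWorld.ckpt.tmp", close the fd, and only then produce "./helloWorld.ckpt") looks like the common write-to-temp-then-rename pattern. Below is a minimal Python sketch of that pattern. It is not Condor's code; the function name write_checkpoint and its arguments are made up for illustration, and the rename step is an assumption inferred from the .tmp file being left behind on failure.

```python
import os


def write_checkpoint(path, segments):
    """Write all segments to path + '.tmp', then rename onto path.

    Hypothetical helper for illustration only, not part of Condor.
    If any step raises OSError, the rename never happens and the
    '.tmp' file is what remains on disk.
    """
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for seg in segments:
            view = memoryview(seg)
            # os.write may write fewer bytes than requested, so loop.
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
    except OSError:
        os.close(fd)
        raise
    # close() can itself fail (for example on some network file systems);
    # the OSError it raises carries the errno that explains why.
    os.close(fd)
    # Only a fully written and closed file replaces the final name.
    os.replace(tmp, path)
    return sum(len(seg) for seg in segments)
```

Wrapping the call in try/except OSError and printing exc.errno together with os.strerror(exc.errno) would report why a close failed, which the "Close failed!" line in the log above does not show.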


-- Tan
On Sat, Mar 22, 2008 at 2:13 PM, Daniel Forrest <forrest@xxxxxxxxxxxxx> wrote:
Tan,

> Thank you for your suggestion. Well, I just tried now with setting
> arch, but still it takes myapp.ckpt.tmp. So that means the checkpoint
> is not successful.

Please try this:

setarch i386 -R ./myapp -_condor_D_ALL

And reply with the debug output.

--
Daniel K. Forrest       Laboratory for Molecular and
forrest@xxxxxxxxxxxxx   Computational Genomics
(608) 262 - 9479        University of Wisconsin, Madison



--
Tanzima Zerin Islam