
Re: [Condor-users] Problem with Condor standalone library



Thanks, Daniel, for all your help. I have emailed the problem to condor-admin. Hopefully they can shed some light on it.
In the meantime, if somebody else knows something, I would really appreciate any help. And if I find a way around the problem,
I will share the solution.

-- Tan

On Sat, Mar 22, 2008 at 3:24 PM, Daniel Forrest <forrest@xxxxxxxxxxxxx> wrote:
Hi Tan,

> Thanks for your prompt reply. The debug output is as follows. Looks
> like it fails when closing the file.

I guess I'm stuck now.  This is X86_64 and I haven't worked with that
before, but it is clear there are problems with the checkpoint code in
any case.

First, note the number of times "0xlx" is printed out instead of any
reasonable number.  Evidently someone typed "0xlx" where "0x%lx" was
meant in several format strings.  Second, the header seems to be 1056
bytes long instead of 1024 bytes.  This can be seen by comparing the
first "Pos: 754720" with 0xb8000, which is 753664 in decimal: the
difference is 1056.  (Note that 0xb8000 is 0x70b000 - 0x653000; it is
also what is printed after "prot=", since "length=" didn't consume an
argument because of the aforementioned format error.)  It should be
1024 bytes, and the fact that it's not implies there are problems with
variable sizes in the checkpoint code.  I also don't understand why
your stack
address is what it is, but I don't know enough about X86_64.  Finally,
I have no idea what could be causing the close to fail, but I haven't
looked at the Condor 7.0 code to see how it might have changed.
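
The two observations above can be reproduced in a small C sketch.  This
is not the Condor checkpoint code: the function names, the two format
macros, and the "prot" value used below are hypothetical, and it assumes
the logged "Pos" is the file offset just after the header and the first
segment have been written.

```c
#include <stdio.h>

/* Hypothetical formats: "0xlx" is literal text, "0x%lx" is a conversion. */
#define BUGGY_FMT "length=0xlx prot=0x%lx"
#define FIXED_FMT "length=0x%lx prot=0x%lx"

/* Format a segment description into buf using the given format.
 * With BUGGY_FMT only one conversion exists, so it consumes `length`
 * and `prot` is evaluated but never printed. */
int describe_segment(char *buf, size_t n, const char *fmt,
                     unsigned long length, unsigned long prot)
{
    return snprintf(buf, n, fmt, length, prot);
}

/* Infer the header size from a segment's address range and the file
 * position reported after that segment was written (assumed meaning
 * of "Pos"). */
long header_bytes(unsigned long seg_start, unsigned long seg_end,
                  long pos_after_segment)
{
    return pos_after_segment - (long)(seg_end - seg_start);
}
```

Called with a length of 0xb8000 and a placeholder prot of 0x3, the buggy
format yields "length=0xlx prot=0xb8000" while the fixed one yields
"length=0xb8000 prot=0x3".  header_bytes(0x653000, 0x70b000, 754720)
returns 1056, which is 32 bytes more than the expected 1024.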

You're going to need someone from the Condor team to resolve this.

--
Daniel K. Forrest       Laboratory for Molecular and
forrest@xxxxxxxxxxxxx   Computational Genomics
(608) 262 - 9479        University of Wisconsin, Madison



--
Tanzima Zerin Islam