Re: [Condor-users] checkpointing produces segfault
- Date: Tue, 28 Feb 2006 10:23:44 -0600
- From: Patrick Huber <phuber@xxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] checkpointing produces segfault
Thanks for that info. Just in case I need it, what would those
special requirements look like, since I am still running 6.7.14?
Do you run on a pool with both regular and "bigmem" kernels? You can't
checkpoint on a regular kernel and then on a bigmem kernel, or
vice versa (i.e. once you checkpoint on one you must checkpoint
on that same flavor for the rest of your job). Same thing
moving between some 2.4 and 2.6 kernels.
The segfaults also happened before the job was checkpointed for the
first time, so this can be at best part of the problem. I
checked my code with valgrind and did indeed find something that
may have caused heap corruption. It seems that my problem is
gone for now...
Dr. Patrick Huber Physics Department
University of Wisconsin
Tel.:+1 608 262 2886 1150 University Avenue
http://pheno.physics.wisc.edu/~phuber Madison, WI 53706, USA