
Re: [Condor-users] Segfault during basic use of standalone checkpointing



Hi Lane,

I don't know if your problem is related (since you're using CentOS, not Fedora), but somebody reported a segmentation fault here:
https://www-auth.cs.wisc.edu/lists/condor-users/2007-November/msg00176.shtml

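[As an aside: (1024)^3 = 1073741824 = 2^30 still fits in a signed 32-bit int (maximum 2147483647), but i*i in the printf overflows once i exceeds 46340. You could consider using long long for the square, or including <limits.h> and using LONG_MAX (the macro is LONG_MAX, not MAX_LONG)
 as an upper bound where long is 64 bits wide. Otherwise, there may be portability issues.]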

Best,
Jochen
________________________________________
From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] On Behalf Of Lane Schwartz [dowobeha@xxxxxxxxx]
Sent: Tuesday, March 08, 2011 8:03 PM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Segfault during basic use of standalone checkpointing

Hi, I'm new to Condor. I just installed Condor 7.4.4 on CentOS 5.5,
and I'm trying out standalone checkpointing for the first time.
Unfortunately, I'm getting a segmentation fault when I try to restart
a program from a checkpoint file.

I've been following the instructions in section 4.2.1 of the manual
(http://www.cs.wisc.edu/condor/manual/v6.4/4_2Condor_s_Checkpoint.html).
Details are below:

I have a program called toy.c:

$ condor_compile gcc -o toy toy.c
LINKING FOR CONDOR ......(some more output).....

$ ./toy
...(TOY PROGRAM OUTPUT)....

(control-Z)
...(PROGRAM STOPS)...

$ ./toy -_condor_restart ./toy.ckpt.tmp
Condor: Notice: Will restart from ./toy.ckpt.tmp
Segmentation fault


My eventual goal is to use Condor for transparent checkpointing of
jobs running under SGE (Sun Grid Engine). But at the moment I can't even get
this toy standalone example to work. (For reference, the source for
toy.c is below.)

If anyone has any tips or pointers, or links to good tutorials on the
use of standalone checkpointing, I'd be much obliged.

Thanks,
Lane


//toy.c
#include <stdio.h>

int main(int argc, char **argv) {

   int i;
   int n;

   n=1024*1024*1024;

   for (i=0; i<n; i+=1) {
      printf("We calculated: %d^2=%d\n", i, i*i);
   }

   return 0;
}