
Re: [Condor-users] First go at standard Universe checkpointing



On Mon, Jan 10, 2011 at 03:58:02PM +0000, Ian Cottam wrote:
> I'm trying to get my first Standard Universe job working, because of
> checkpointing (Condor 7.4.4 RedHat build).
> 
> I've replicated the problems I'm having by means of a tiny C program.
> Said problems being...
> - the check point file ends in .tmp and seems far too small; and
> - when I restart the test case it immediately seg faults.
> 
> Any ideas?
> 
> Test code looks like this----
> #include <stdio.h>
> #include <math.h>
> int main(void)
> {
>  int i; double x;
>  FILE *f= fopen("r-out.txt", "w");
>  fputs("hello Condor Standard Universe - starting\n", f);
>  for (i= -500000000; i != 500000000; ++i) {
>   x= sqrt(i<0?-i:i); /* kill some time */ 
>  }
>  fputs("finished OK\n", f);
>  return 0;
> }
> 
> 
> ---
> 
> Compiled with---
> condor_compile gcc cctest.c -o cctest -lm
> ---
> I'm testing by just doing a standalone
> ./cctest
> 
> Followed by a control-Z to make it checkpoint and quit.

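The seg fault on restart is due to address space randomization and the
virtual memory layout (placement of the VDSO): the checkpoint expects
memory to be laid out the same way on restart as it was when the
checkpoint was written.  Run it like this: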

setarch i386 -R -L ./cctest


And restart it like this:

setarch i386 -R -L ./cctest -_condor_restart cctest.ckpt


This assumes you are running 32-bit, which I would bet you are.
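
The two command lines above differ only in the restart arguments, so they
can be generated by one small shell function.  This is a minimal sketch, not
something Condor ships: the name ckpt_cmd and its argument order are made up
for illustration.  In util-linux setarch, -R is --addr-no-randomize and -L is
--addr-compat-layout.  Passing an architecture other than i386 (for example
the output of uname -m on a 64-bit machine) is an assumption that has not
been tested with standard universe binaries.

```shell
#!/bin/sh
# Hypothetical helper (not part of Condor): print the setarch command line
# for a first run or for a restart from a checkpoint file.
#   $1 = architecture name understood by setarch, e.g. i386
#   $2 = executable built with condor_compile
#   $3 = optional checkpoint file; if given, a restart command is printed
ckpt_cmd() {
    arch=$1
    prog=$2
    ckpt=${3:-}
    if [ -n "$ckpt" ]; then
        # Restart: same flags, plus the -_condor_restart argument.
        echo "setarch $arch -R -L $prog -_condor_restart $ckpt"
    else
        # First run: only disable randomization and use the compat layout.
        echo "setarch $arch -R -L $prog"
    fi
}

# Print both command lines for the test case from this thread.
ckpt_cmd i386 ./cctest
ckpt_cmd i386 ./cctest cctest.ckpt
```

The function only prints the commands, so they can be inspected before being
run by hand or through eval.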

-- 
Dan