
[Condor-users] Checkpointing failed on X86_64



I compiled this simple program with "condor_compile gcc -o count count.c":

#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include <unistd.h>   /* sleep */

int main(int argc, char *argv[]) {
   int Counter = 0;
   int i;

   if (argc < 2) {
      printf("count num_seconds\n");
      return 0;
   }

   Counter = atoi(argv[1]);
   for (i=0; i<Counter; i++) {
      printf("%d\n", i); fflush(stdout);
      sleep(1);
   }

   return 0;
}

When I ran condor_hold on the job while the program was running, I got
this error in the log file:

001 (008.000.000) 11/17 19:13:25 Job executing on host: <10.10.20.90:42208>
...
004 (008.000.000) 11/17 19:15:20 Job was evicted.
        (0) Job was not checkpointed.
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        570  -  Run Bytes Sent By Job
        4754958  -  Run Bytes Received By Job

I looked in the manual:
http://www.cs.wisc.edu/condor/manual/v6.8/1_5Availability.html#sec:Availability

It appears condor_compile is not supported on my platform, Fedora Core 4
on Opteron (x86_64). Is that the real reason the job was not checkpointed?

Thanks

Junjun

--
Dr. Junjun Mao, Research Associate
Steinman Hall, #1M-11
Levich Institute at City College of CUNY
140th Street & Convent Avenue
New York, NY 10031
(212) 650-6845 (Phone) 
(212) 650-6835 (fax)