
[Condor-users] checkpoint size limit?



Currently condor_q shows this job with a SIZE of 2500 (megabytes):

 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE   CMD
 271.0   XXXXX           3/22 00:43   0+11:08:31 R  0   2500.0 lissom 051119_line


From my shadow log:

3/22 11:52:30 (271.0) (1981):Read: Done saving file state
3/22 11:52:30 (271.0) (1981):Read: About to update MyImage
3/22 11:52:30 (271.0) (1981):Read: Internal error, ckpt size calculated is -1993149441
3/22 11:52:30 (271.0) (1981):Shadow: Job 271.0 exited, termsig = 9, coredump = 0, retcode = 0
3/22 11:52:30 (271.0) (1981):Shadow: Job was kicked off without a checkpoint


Has my user gone past the limitations of Condor for IA32, or am I jumping to conclusions? Is x86_64 anywhere near release? I have a cluster full of Opterons waiting for it.
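
For what it's worth, here is the arithmetic behind my guess. This is only a sketch, and it assumes (the log does not confirm it) that the shadow holds the checkpoint size in a signed 32-bit integer that wrapped around exactly once. The helper name unwrap_int32 is mine, not anything from Condor.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def unwrap_int32(logged: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned 32-bit.

    Assumption: the true size wrapped past INT32_MAX exactly once.
    If it wrapped more than once, the real size cannot be recovered
    from the logged number alone.
    """
    return logged & 0xFFFFFFFF


logged_size = -1993149441              # value from the shadow log above
actual_size = unwrap_int32(logged_size)

print(actual_size)                     # 2301817855 bytes
print(actual_size > INT32_MAX)         # True: too large for a signed 32-bit int
```

Under that assumption the checkpoint would be about 2.3 billion bytes, just over the 2 GB mark, which is in the same range as the SIZE reported by condor_q.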

- dave