[Condor-users] 32bit checkpoint servers and 64bit nodes?
- Date: Tue, 19 Feb 2008 10:22:37 -0600
- From: "David A. Kotz" <dkotz@xxxxxxxxxxxxx>
- Subject: [Condor-users] 32bit checkpoint servers and 64bit nodes?
I've recently converted part of my Linux Condor pool to 64-bit Ubuntu
and 64-bit Condor. Both of my initial 64-bit users have reported that
their jobs are failing to checkpoint and continually restarting. All of
the checkpoint servers in my pool are running 32-bit Ubuntu and 32-bit
Condor. Is there any known issue with this configuration? I'd assumed
that the checkpoint server just receives a data stream and dumps it to
disk, so the word size wouldn't matter. The checkpoint servers in
question are doing checkpoints for other jobs.
The shadow log on the 64-bit submit node has lines like these:
2/18 22:35:10 (765.0) (22524):store request to ckpt server failed, trying again in 320 seconds
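
To see how many jobs are hitting this, something like the following could tally the failures per job. It is only a rough sketch: the regular expression is modeled on the one log entry quoted above, it assumes each entry occupies a single line in the log file, and the function name count_failed_stores is made up for illustration. Other shadow log formats may not match.

```python
import re
from collections import Counter

# Pattern modeled on the single shadow log entry quoted in this message.
# Assumption: other Condor versions or log settings may format it differently.
FAILED_STORE = re.compile(
    r"^(?P<date>\d+/\d+)\s+(?P<time>\d+:\d+:\d+)\s+"
    r"\((?P<job>\d+\.\d+)\)\s+\((?P<pid>\d+)\):\s*"
    r"store request to ckpt server failed,\s+"
    r"trying again in (?P<delay>\d+) seconds"
)


def count_failed_stores(lines):
    """Return a Counter mapping job id -> number of failed store requests."""
    counts = Counter()
    for line in lines:
        match = FAILED_STORE.match(line.strip())
        if match:
            counts[match.group("job")] += 1
    return counts


# The sample entry from the message, joined onto one line.
sample = [
    "2/18 22:35:10 (765.0) (22524):store request to ckpt server failed, "
    "trying again in 320 seconds",
]
print(count_failed_stores(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the path to the shadow log depends on the local configuration.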