I haven't done much beyond dummy jobs with
checkpointing, so I can't really speculate on that 7.6.6 behavior.
Perhaps one of the devs will expand
on this, but based on the documentation it appears that because the CheckpointPlatform
is an "opaque" string, nothing parses it by default:
the only way a checkpoint will restart on a given machine is when the original
machine's string matches the new machine's exactly.
So even if, say, the original machine
is SSSE3 and thus the executable will run fine on both older and newer
platforms, the new machine won't be considered a valid checkpoint platform
because its CheckpointPlatform string also includes the sse4_1 and sse4_2
tags - the old machine's CheckpointPlatform without those tags won't
be an exact match.
My "substring" example would
NOT allow newer machines' checkpoints to run on older platforms, but the
fix for that at your site is just to compare only the parts
of the CheckpointPlatform strings before they start listing the SSE versions
- that is, just the first four fields (at least for this version of HTCondor).
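To make the difference between the exact match and the first-four-fields
comparison concrete, here is a rough Python sketch. It is only an
illustration of the string logic, not how HTCondor itself evaluates
anything; the function names are made up, and the two CheckpointPlatform
strings are illustrative values rather than output captured from a real pool.

```python
def platform_prefix(checkpoint_platform, fields=4):
    """Return the first `fields` whitespace-separated fields of the string."""
    return checkpoint_platform.split()[:fields]


def exact_match(a, b):
    """Default behavior as described above: the opaque strings must be identical."""
    return a == b


def prefix_match(a, b, fields=4):
    """Site-specific alternative: compare only the leading fields,
    ignoring the trailing instruction-set tags."""
    return platform_prefix(a, fields) == platform_prefix(b, fields)


# Illustrative strings only (assumed layout: four leading fields, then tags).
old_machine = "LINUX X86_64 2.6.x normal N/A ssse3"
new_machine = "LINUX X86_64 2.6.x normal N/A ssse3 sse4_1 sse4_2"

print(exact_match(old_machine, new_machine))   # strings differ in the tags
print(prefix_match(old_machine, new_machine))  # leading four fields agree
```

The prefix comparison works in both directions, so a checkpoint taken on
the newer machine would also be accepted on the older one, which is only
safe if the executable does not use the newer opcodes.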
It seems, though, that if your executables
WERE using SSE opcodes, you'd want to add "TARGET.has_sse4_2"
or what have you to your requirements expression, and just completely ignore
the SSE pieces of the CheckpointPlatform anyway.
Michael V. Pelletier
IT Program Execution
978.858.9681 (5-9681) NOTE NEW NUMBER