Re: [HTCondor-users] Overwriting Checkpoint platform
You might take a look at reworking the
IS_VALID_CHECKPOINT_PLATFORM expression to either ignore the SSE versions,
if you're certain that none of your standard universe jobs use SSE4 opcodes,
or allow checkpoints from a lower SSE version to resume on higher versions.
As of 8.2.9 this is the default expression,
from page 212 of the manual:
IS_VALID_CHECKPOINT_PLATFORM = (((TARGET.JobUniverse == 1) == FALSE) || \
  ((MY.CheckpointPlatform =!= UNDEFINED) && \
  ((TARGET.LastCheckpointPlatform =?= MY.CheckpointPlatform) || (TARGET.NumCkpts == 0))))
The checkpoint platform attribute is
described as "opaque," so you can't expect any parsing of it
to keep working forever, but for your purposes at your site it may be what you
need. You'd replace the =?= with some other method of comparing the two
sides that will give you what you want.
For example, if the test were whether TARGET.LastCheckpointPlatform
is a substring of MY.CheckpointPlatform, that would allow ssse3 checkpoints
to resume on SSE4.1 and SSE4.2 machines.
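
As a rough, untested sketch of that idea: the three platform strings reported at your site each begin with the previous one, so a prefix comparison built from the ClassAd functions substr() and size() might stand in for the =?= test. This assumes that a newer platform string always starts with the older one, which the "opaque" caveat does not guarantee. The NumCkpts test is moved ahead of the comparison so that jobs with no checkpoint yet never reach substr() with an undefined LastCheckpointPlatform.

```
# Sketch only: this parses CheckpointPlatform, which the manual calls opaque.
# Assumption: a machine with more SSE features advertises a string that
# begins with the string of a machine with fewer, for example
#   "... ssse3"  versus  "... ssse3 sse4_1".
IS_VALID_CHECKPOINT_PLATFORM = \
  ( (TARGET.JobUniverse == 1) == FALSE ) || \
  ( (MY.CheckpointPlatform =!= UNDEFINED) && \
    ( (TARGET.NumCkpts == 0) || \
      (substr(MY.CheckpointPlatform, 0, size(TARGET.LastCheckpointPlatform)) \
         == TARGET.LastCheckpointPlatform) ) )
```

With this in place, a job whose last checkpoint was written on an ssse3-only machine would match the sse4_1 and sse4_2 machines, but a checkpoint written on an sse4_2 machine would still not match an ssse3-only one.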
Michael V. Pelletier
IT Program Execution
978.858.9681 (5-9681) NOTE NEW NUMBER
10/08/2015 12:03 PM
Overwriting Checkpoint platform
Hi, I have a question about where to configure checkpointing.
I'm running HTCondor 8.2.8 on a CentOS submit and
central manager server and several execute nodes.
The execute nodes are heterogeneous (different processors)
and compute the checkpoint platform differently. If I run
condor_status -format "%s\n" checkpointplatform | sort | uniq -c
20 LINUX X86_64 2.6.x normal 0x2aaaaaaab000 ssse3
40 LINUX X86_64 2.6.x normal 0x2aaaaaaab000 ssse3 sse4_1
56 LINUX X86_64 2.6.x normal 0x2aaaaaaab000 ssse3 sse4_1 sse4_2
This keeps idle jobs from running on available machines
because the checkpoint platform is slightly different.
Should I overwrite the CHECKPOINT_PLATFORM macro,
configure a checkpoint server, or is there another option?
Thanks in advance.