Suppose the checkpoint scheme is triggered once a
specified number of cycles has passed. The
functional module (OPAL Pseq?) is involved: if the
checkpointing condition is satisfied, it sends a synchronization request to
all the cores that are running the parallel benchmark. After
receiving the synchronization request, each running core
stops simulating and sends an ACK to the requestor. Finally, once the requestor has
received all the ACKs, it calls the writeCheckpoint function to save the state. Is this
the right way to think about simulating the checkpointing?
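
To make the sequence concrete, here is a minimal Python sketch of the handshake described above. It is only an illustration of the idea, not OPAL or RUBY code: the Core and Requestor classes, tick, on_sync_request, and simulate are all hypothetical names, and write_checkpoint merely stands in for the writeCheckpoint call. It also assumes that requests and ACKs are delivered instantly, so it shows the ordering of events but not their timing.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical stand-in for one simulated core."""
    core_id: int
    running: bool = True

    def on_sync_request(self) -> int:
        # Stop simulating and acknowledge by returning this core's id.
        self.running = False
        return self.core_id


@dataclass
class Requestor:
    """Hypothetical stand-in for the module that triggers checkpoints."""
    cores: list
    interval: int  # assumed trigger: checkpoint every `interval` cycles
    checkpoints: list = field(default_factory=list)

    def tick(self, cycle: int) -> bool:
        # Checkpoint condition: a whole number of intervals has elapsed.
        if cycle == 0 or cycle % self.interval != 0:
            return False
        # Send the synchronization request to every core and collect ACKs.
        acks = {core.on_sync_request() for core in self.cores}
        if acks != {core.core_id for core in self.cores}:
            return False  # not every core acknowledged
        # All ACKs received: save the state, then let the cores resume.
        self.write_checkpoint(cycle)
        for core in self.cores:
            core.running = True
        return True

    def write_checkpoint(self, cycle: int) -> None:
        # Placeholder: only records the cycle at which state would be saved.
        self.checkpoints.append(cycle)


def simulate(num_cores: int, interval: int, total_cycles: int) -> list:
    requestor = Requestor([Core(i) for i in range(num_cores)], interval)
    for cycle in range(total_cycles):
        requestor.tick(cycle)
    return requestor.checkpoints


print(simulate(num_cores=4, interval=100, total_cycles=350))  # [100, 200, 300]
```

What the sketch leaves out is exactly what I am asking about: how many cycles pass between the request, the ACKs, and the completion of the checkpoint.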
How can I use the OPAL and RUBY modules to implement this
simulation?
I understand that synchronization can be done at
the program level with a barrier function call, but I am interested in the timing of
the synchronization and of the checkpoint creation.
Some of you may have already worked on checkpointing
simulation; I am looking forward to your help.
Thanks in advance!
Regards,
shuchang