Re: [HTCondor-users] using idle computers in computer labs for CFD jobs

On 19/10/2015 03:43, "HTCondor-users on behalf of David Herd"
wrote:

>The jobs we run are mostly CFD and use Ansys.  As such we can't link them
>HT Condor modules and it looks like we won't be able to take checkpoints
>of our jobs.

We just encourage our users to write their own checkpointing code under
the vanilla universe. We also have templates for e.g. C and MATLAB.
Basically, on startup you check whether a checkpoint file exists and, if
it does, resume the computation from the point its contents define; you
then update the file periodically (or on eviction). Condor handles all
the rest.
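
To make the pattern concrete, here is a rough Python sketch of it. This is
not one of the C or MATLAB templates mentioned above, just an illustration:
the file name checkpoint.json, the state layout, and the function names are
all made up, and the "work" is a stand-in for a real solver step.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical file name


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0.0}


def save_state(state, path=CHECKPOINT):
    """Write to a temp file, then rename it over the checkpoint, so a kill
    in the middle of a write cannot leave a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run(n_steps, every=100, path=CHECKPOINT):
    """Resume from the checkpoint if present and save every `every` steps."""
    state = load_state(path)
    while state["step"] < n_steps:
        state["total"] += state["step"] * 0.5  # stand-in for real work
        state["step"] += 1
        if state["step"] % every == 0:
            save_state(state, path)
    save_state(state, path)
    return state
```

If the job is killed and restarted in a directory that still holds the
checkpoint file, run() picks up from the last saved step instead of step 0.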

The very latest Condor (which we don't run) has a little more help for
vanilla checkpointing, but it doesn't save the user a lot of code
(basically you would only need the part that writes the file on evict,
I think). If nearly all your users run Ansys, you could likely figure out
a template for checkpointing that they could all copy.
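
For the "write the file on evict" part, a minimal Python sketch might look
like the following. It assumes the job receives SIGTERM when it is evicted
and gets a little time to exit before being killed hard; check how your
pool is configured. The names request_stop, write_checkpoint and run are
made up for illustration, and the loop body stands in for a real solver
iteration.

```python
import json
import os
import signal

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only set a flag; the main loop does the write."""
    global stop_requested
    stop_requested = True


# Assumption: eviction is delivered to the job as SIGTERM.
signal.signal(signal.SIGTERM, request_stop)


def write_checkpoint(state, path):
    """Write to a temp file first, then rename it over the checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(state, n_steps, path="checkpoint.json"):
    """Advance until done or asked to stop; return True if finished."""
    while state["step"] < n_steps:
        if stop_requested:
            write_checkpoint(state, path)
            return False
        state["step"] += 1  # stand-in for one solver iteration
    return True
```

The handler only sets a flag so that the checkpoint is written between
iterations, when the state is consistent. For the file to survive an
eviction it also has to be transferred back, e.g. by setting
when_to_transfer_output = ON_EXIT_OR_EVICT in the submit file.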


