Re: [HTCondor-users] using idle computers in computer labs for CFD jobs
- Date: Mon, 19 Oct 2015 15:20:45 +0000
- From: Ian Cottam <Ian.Cottam@xxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] using idle computers in computer labs for CFD jobs
On 19/10/2015 03:43, "HTCondor-users on behalf of David Herd"
<htcondor-users-bounces@xxxxxxxxxxx on behalf of d.herd@xxxxxxxxxxx> wrote:
>The jobs we run are mostly CFD and use Ansys. As such we can't link them
>HT Condor modules and it looks like we won't be able to take checkpoints
>of our jobs.
We just encourage our users to write their own checkpointing code under
the vanilla universe. We also have templates for e.g. C and MATLAB.
Basically, you have to check on startup for the existence of a checkpoint
file and if present start the computation from the point its contents
define; and then also periodically update it (or update it on evict).
Condor handles all the rest.
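As a rough illustration of that pattern (not one of our actual templates), here is a minimal Python sketch. The file name checkpoint.json, the function names, and the toy "work" loop are all made up for the example. It also assumes the job receives SIGTERM on eviction, which is worth confirming for your own pool and platform.

```python
import json
import os
import signal
import sys

CHECKPOINT = "checkpoint.json"  # hypothetical file name
_evict_requested = False        # set by the SIGTERM handler below


def _on_sigterm(signum, frame):
    """Only record the request; the main loop saves at a safe point."""
    global _evict_requested
    _evict_requested = True


def load_checkpoint(path=CHECKPOINT):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_checkpoint(state, path=CHECKPOINT):
    """Write to a temp file and rename it, so an interrupted write
    cannot leave a truncated checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(n_steps, every=100, path=CHECKPOINT):
    """Resume from the checkpoint if present, then checkpoint
    periodically and when an eviction has been requested."""
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        state["total"] += state["step"]  # stand-in for the real computation
        state["step"] += 1
        if _evict_requested:
            break
        if state["step"] % every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, _on_sigterm)
    final = run(1000)
    # Exit non-zero if we stopped early because of an eviction.
    sys.exit(0 if final["step"] >= 1000 else 1)
```

For the checkpoint file to follow the job to another machine, the submit file would presumably also need file transfer enabled with when_to_transfer_output = ON_EXIT_OR_EVICT.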
The very latest Condor (which we don't run) has a little more help for
vanilla checkpointing, but it doesn't save the user a lot of code
(basically you could do just the write-the-file-on-evict part, I think). If
nearly all your users run Ansys, you could likely figure out a template
for checkpointing that they could all copy.
Ian Cottam | IT Relationship Manager | IT Services | C38 Sackville
Street Building | The University of Manchester | M13 9PL |