
Re: [HTCondor-users] Caching large executable on worker nodes



On 08/11/2015 02:21 PM, Jens Schmaler wrote:

> In my case, the executable is the same for all jobs in one cluster, but
> it is different for each cluster. So a separate class ad does not really
> seem to be a viable solution, right?

Ah, I see -- I missed the "different for each cluster" bit.

One problem with external file transfer mechanisms is the race condition
between when the transfer finishes and when the job starts. Making the
transfer a parent node in the DAG is a way around that. But then you
have to lay it out so the transfer happens once per machine and, in your
case, also once per cluster.
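A minimal DAG sketch of that idea might look like the following -- the node
names and submit file names (transfer_exe.sub, jobs.sub) are placeholders,
not anything HTCondor requires:

```
# Hypothetical DAG: stage the executable first, then run the cluster's jobs.
JOB TransferExe transfer_exe.sub
JOB ClusterJobs jobs.sub
PARENT TransferExe CHILD ClusterJobs
```

DAGMan then guarantees ClusterJobs is not submitted until TransferExe has
completed, which closes the race window.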

You can probably also keep cluster n+1's transfer and jobs from starting
until cluster n's transfer has finished, by making them children of
transfer n.
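Sketched as a DAG fragment for two clusters (again, all node and submit
file names here are hypothetical):

```
# Hypothetical DAG chaining two clusters: cluster B's transfer
# waits for cluster A's transfer to finish.
JOB TransferA transfer_a.sub
JOB JobsA    jobs_a.sub
JOB TransferB transfer_b.sub
JOB JobsB    jobs_b.sub
PARENT TransferA CHILD JobsA TransferB
PARENT TransferB CHILD JobsB
```

This serializes the transfers while still letting each cluster's jobs run
as soon as their own executable is in place.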

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
