
[HTCondor-users] "incremental" (singularity) jobs



Hi,

I cannot find a straightforward solution to the following problem, and
I would be glad if someone could put me on the right track for solving
it, or for reframing it.

We have jobs covering a wide range of data processing tasks. They all
have in common that the specific code/applications come as Singularity
images provided by third parties. To perform the computations, data
need to be pulled from a data management system at the beginning, and
results need to be put back into it at the end. The execute nodes do
not have the required data management software installed, though.
Given that the core processing is already done via Singularity, it
would be easy to provide the data management software via such an
image as well. However, it would be very difficult to fold it into all
the individual Singularity images provided by third parties.

Q: Is it possible to bind three Singularity jobs (each with its own
Singularity image) together, such that they all run on the same (but
otherwise arbitrary) machine and share a common temporary work dir
(the execute nodes have no shared filesystem)? The shared work dir is
important, as the dataset is substantial (several hundred GB): moving
intermediate results between the prep, computation, and finalize
stages would put substantial stress on the network, while the final
results tend to be rather small.
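For concreteness, the workflow I have in mind is roughly the
following, here collapsed into a single wrapper for illustration (the
image names, bind paths, and "dm-get"/"dm-put" commands are made-up
placeholders, not real tools):

```shell
#!/bin/bash
# wrapper.sh -- sketch of the three stages sharing one work dir.
# HTCondor exposes the job's per-node scratch directory in
# $_CONDOR_SCRATCH_DIR; all three containers bind-mount it.
set -euo pipefail

WORKDIR="$_CONDOR_SCRATCH_DIR/data"
mkdir -p "$WORKDIR"

# Stage 1 (prep): pull input data via the data-management image
singularity exec --bind "$WORKDIR":/data dm.sif dm-get /data

# Stage 2 (compute): run the 3rd-party processing image on the same dir
singularity exec --bind "$WORKDIR":/data compute.sif process /data

# Stage 3 (finalize): push the (small) results back
singularity exec --bind "$WORKDIR":/data dm.sif dm-put /data/results
```

What I am really after, though, is this same data flow expressed as
three *separate* HTCondor jobs (each with its own image) that are
guaranteed to land on the same execute node and see the same scratch
directory.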

I'd be happy for any suggestions. Thanks!

Michael