
Re: [HTCondor-users] "incremental" (singularity) jobs



Hi

I do not have an all-encompassing solution, but I have already given this type of problem some thought.

The whole point of a container solution (Singularity included, but not only) is to isolate the processes in the container from the rest of the world (the host and other containers), so from the container host's point of view (in this case the HTCondor execute node) this is an inter-process communication problem. The options then are:

- pipe from data management to singularity

The submit file could look like:

executable = /bin/sh
arguments = "-c 'process_on_the_execute_node fetch data | singularity exec $MY_SINGULARITY_EXEC_OPTIONS my_script_inside_the_container | post_processing_on_the_execute_node'"
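
For context, a more complete submit file around such a pipeline could look roughly like this (a sketch only; run_pipeline.sh is a hypothetical wrapper script holding the pipe above, and the resource requests are placeholders to adapt to your data):

universe                = vanilla
executable              = run_pipeline.sh
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
request_cpus            = 1
request_memory          = 4GB
request_disk            = 200GB
output                  = pipeline.$(Cluster).$(Process).out
error                   = pipeline.$(Cluster).$(Process).err
log                     = pipeline.$(Cluster).log
queue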

- use a socket or a fifo declared on the host and bound into the singularity image: data management does its thing and writes to the socket or fifo, while the processes inside the container just read from there, oblivious to the fact that the other end is handled from outside the container

Since data_management and the container processes run in parallel here, this could probably be organized as a (dynamic?) DAG.
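
A rough sketch of the fifo variant inside a single job's wrapper script (data_management_fetch and my_script_inside_the_container stand in for your actual tools; paths and image names are placeholders):

#!/bin/sh
# create the fifo in the job's scratch directory on the execute node
mkfifo "$_CONDOR_SCRATCH_DIR/datapipe"
# the data management client runs on the host and writes into the fifo
data_management_fetch > "$_CONDOR_SCRATCH_DIR/datapipe" &
# inside the container the same fifo is visible as /work/datapipe
singularity exec --bind "$_CONDOR_SCRATCH_DIR:/work" \
    my_3rd_party_image.simg \
    my_script_inside_the_container /work/datapipe
wait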


- bind a common directory from the host into the container and read and write files there (this will raise concurrency concerns between data_management and the container)
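
For example (again a sketch, names are placeholders), binding the job's scratch directory into the container and letting both sides read and write files there:

singularity exec --bind "$_CONDOR_SCRATCH_DIR:/data" \
    my_3rd_party_image.simg my_script_inside_the_container /data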


- shared memory: I believe it is possible, but I think the configuration on the host would be far too convoluted to be useful at the scale of an HTCondor cluster (and using /dev/shm is, functionally, just the previous scenario again)


That being said, do not forget that it is possible to subclass Singularity images for your own benefit by using recipes:

http://singularity.lbl.gov/docs-recipes

https://www.sylabs.io/guides/2.5.1/user-guide/container_recipes.html

If your data management client is not too convoluted, that is the route I would personally investigate, with a series of recipes looking like:

Bootstrap: shub
From: my_3rd_party_image


%help
    adding "data management" to my_3rd_party_image

%post
    # install the data management client inside the container
    add_my_data_management_repository
    apt-get update
    apt-get install -y my_data_management

%files
    # for legacy Singularity < 2.3, copy files in %setup instead
    copy_configuration_to_make_my_data_management_client_useful


Depending on the number of 3rd-party images and the complexity of the installation of the data_management client, this may or may not be a viable option.
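
If it looks viable, a derived image can then be built from such a recipe with something like (file names are placeholders):

sudo singularity build my_3rd_party_image_with_dm.simg my_recipe.def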

HTH
Philippe



On 08/19/2018 12:18 AM, Michael Hanke wrote:
Hi,

I cannot find a straightforward solution for the following problem, and
I would be glad if someone could put me on the right track on how to do
it, or how to reframe the problem.

We have jobs to process that cover a wide range of data processing. They
all have in common that specific code/applications come in singularity
images that are provided by 3rd-parties. To perform the computations,
data need to be pulled from a data management system at the beginning
and results need to be put back into it at the end. The execute nodes do
not have the required data management software, though. Given that the
core processing is done via singularity, it would be easy to provide the
data management software via such an image as well. However, it would be
very difficult to fold it into all the individual singularity images
provided by 3rd-parties.

Q: Is it possible to bind three singularity jobs (each with its own
singularity image) together, such that they run on any machine, but all
on the exact same one, and such that they all share a common temporary
work dir (the execute nodes have no shared filesystem)? The shared work
dir is important, as the size of the dataset is substantial (>x*100GB)
and moving the job results between the prep, computation, and finalize
stages would put substantial stress on the network, while the final
results tend to be rather small.

I'd be happy for any suggestions. Thanks!

Michael

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/