
Re: [HTCondor-users] Conda env with workstation pool



Hi Matt,

Three ideas come to mind:

1.  We've seen sites wrap the HTCondor interaction in higher-level tools that users use instead of HTCondor commands:
   - Pro: Users need to learn very little -- only enough to invoke the high-level tool correctly.
   - Con: Users don't learn how to help themselves, and the tool can support only a limited number of workflows.

2.  The USER_JOB_WRAPPER, run at the startd, provides a powerful hook for customizing the job startup environment. Separately, note that Singularity can be used to start jobs inside containers you curate.
   - Pro: As you execute arbitrary code, you can do a number of transformations impossible to do at the schedd side.
   - Con: As far as users are concerned, this is outright magic.  They'll learn the HTCondor system more but may not be able to transfer the knowledge to other places without said magic.

3.  Perhaps you can just provide an easy way to generate the right environments and have users utilize HTCondor directly.  That is, write a wrapper on the submit side which takes a conda environment and produces a corresponding file (which is really a Singularity image - but no need to tell the user that) and provide explanation of how to run inside that file.  This way, the conda environment is always activated in the job - pure python.
   - You still end up doing some work to write the "conda-to-singularity" script but after that the users can still see all the pieces working together.
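
For idea 2, a minimal sketch of what such a wrapper could look like follows. It assumes the site keeps a curated conda environment at a fixed path on every execute node (the path `/opt/site-conda/envs/default` and the function names are made up for illustration) and that the admin points USER_JOB_WRAPPER at this script. HTCondor hands the wrapper the job's executable and arguments, and the wrapper has to exec the job rather than fork it. Note this only adjusts PATH and CONDA_PREFIX; it is not a full `conda activate`.

```python
#!/usr/bin/env python3
"""Hypothetical USER_JOB_WRAPPER: put a site-curated conda env on PATH, then exec the job."""
import os
import sys

# Assumption: a shared environment is installed here on every execute node.
SITE_ENV_PREFIX = "/opt/site-conda/envs/default"


def job_environment(base_env, prefix):
    """Return a copy of base_env with the env's bin directory first on PATH."""
    env = dict(base_env)
    bin_dir = os.path.join(prefix, "bin")
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    env["CONDA_PREFIX"] = prefix
    return env


def exec_job(argv):
    """argv[1] is the job executable, argv[2:] are its arguments."""
    # Replace the wrapper process with the job itself (exec, not fork).
    os.execve(argv[1], argv[1:], job_environment(os.environ, SITE_ENV_PREFIX))


# Only act when HTCondor has actually passed a job to run.
if __name__ == "__main__" and len(sys.argv) > 1:
    exec_job(sys.argv)
```

With something like this in place, a job whose executable starts with `#!/usr/bin/env python3` would pick up the environment's interpreter without the user doing anything, which is exactly the "magic" the con above refers to.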

Brian

On Jul 27, 2020, at 12:42 PM, West Matthew <matthew.west@xxxxxxxx> wrote:

Hi All,

I want to stress something that I might not have made clear in previous messages.

I work with a lot of folks who have little experience with running software outside of a single machine. They need to scale up their computing effort in order to do a quality analysis, but HTC is not why they got interested in the work. This is not to say a biologist or civil engineer cannot learn the tools to containerize their own software, but incentives that prioritize immediate results often make researchers hesitant to try new tools.

- Why do I need all these other skills just to run software like it does on my laptop? 

For mature projects and more experienced developers, I believe using containers is an easy sell. But I hope y'all can understand the predicament of lowering the barrier to entry to working in an HTC framework for beginners.

Cheers,
Matt


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Michael Pelletier via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Monday, July 27, 2020 1:11:19 AM
To: HTCondor-Users Mail List
Cc: Michael Pelletier
Subject: Re: [HTCondor-users] Conda env with workstation pool
 
I set up a Singularity container to auto-activate a specified environment at startup, which works pretty well for my users. You might also be able to use the OS-native python to write a python-language wrapper that pulls in all the activation environment variables to its own environment, and then invokes the target Python script.
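
A rough sketch of that Python-language wrapper idea is below. It sources an activation script in a bash subprocess, reads back the resulting environment, and then replaces itself with the `python` found on the activated PATH. The function names and the command-line layout are invented for illustration, and it assumes bash and GNU `env` (for the `-0` option) are present on the execute node.

```python
#!/usr/bin/env python3
"""Sketch: import an activation script's environment, then run the target script."""
import os
import subprocess
import sys


def activated_environment(activate_script):
    """Source activate_script in bash and return the resulting environment as a dict."""
    # `env -0` prints NUL-separated NAME=VALUE pairs, so values containing
    # newlines are not split incorrectly.
    command = '. "$1" >/dev/null && env -0'
    output = subprocess.run(
        ["bash", "-c", command, "bash", activate_script],
        check=True,
        stdout=subprocess.PIPE,
    ).stdout
    pairs = (item.split(b"=", 1) for item in output.split(b"\0") if b"=" in item)
    return {os.fsdecode(name): os.fsdecode(value) for name, value in pairs}


def run_in_env(activate_script, script, args):
    """Replace this process with `python script args...` under the activated environment."""
    env = activated_environment(activate_script)
    # execvpe looks up "python" using the PATH from env, i.e. the activated one.
    os.execvpe("python", ["python", script, *args], env)


# Hypothetical usage: wrapper.py ENV_DIR/bin/activate target.py [args...]
if __name__ == "__main__" and len(sys.argv) > 2:
    run_in_env(sys.argv[1], sys.argv[2], sys.argv[3:])
```

Pass the activation script by its full or relative path with a directory component (for example `env/bin/activate`), since a bare file name would make the shell search PATH for it.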

 

Michael V Pelletier
Principal Engineer

Raytheon Technologies
Information Technology
Digital Transformation & Innovation
 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of West Matthew
Sent: Friday, July 24, 2020 7:48 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] Re: [HTCondor-users] Conda env with workstation pool

 

Hi Josh,

I am working from the CHTC recommendation using conda-pack. I am curious if you know a way to activate the conda environment from within a Python script? Having to wrap my Python executable with a bash script is rather frustrating when I am trying to get away from a multi-language situation.

I can create the directory and extract the contents of the tarball. I just need to activate the env.
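
One way to stay in pure Python, sketched here under some assumptions, is to skip activation and instead restart the script under the interpreter inside the unpacked environment. As far as I understand the conda-pack documentation, running the env's `bin/python` directly works for most Python libraries, though packages that depend on the prefix being fixed up may not. The tarball name `env.tar.gz`, the directory name `env`, and the function names are placeholders.

```python
import os
import sys
import tarfile


def unpack_env(tarball, dest):
    """Extract a conda-pack tarball into dest and return the env's interpreter path."""
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(tarball) as archive:
        archive.extractall(dest)
    return os.path.join(dest, "bin", "python")


def reexec_with_env(tarball, dest="env"):
    """Restart the current script under the packed environment's interpreter."""
    # If we are already running from the unpacked environment, do nothing.
    if os.path.realpath(sys.prefix) == os.path.realpath(dest):
        return
    env_python = unpack_env(tarball, dest)
    # Replace this process; the script then starts again from the top
    # with the environment's own Python and site-packages.
    os.execv(env_python, [env_python, *sys.argv])
```

The analysis script would call `reexec_with_env("env.tar.gz")` as its first statement, before importing anything that lives only in the conda environment, and the tarball would be listed in the job's input files.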

Cheers,
Matt

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Josh Karpel via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Thursday, July 23, 2020 4:41:59 PM
To: HTCondor-Users Mail List
Cc: Josh Karpel
Subject: Re: [HTCondor-users] Conda env with workstation pool

 

That pretty much lines up with what we tell people to do at CHTC: http://chtc.cs.wisc.edu/conda-installation.shtml

 

 

Josh Karpel

 

 

On Thu, Jul 23, 2020 at 8:25 AM Michael Pelletier via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
I've built a Singularity definition file that installs Miniconda and creates an environment YAML file, then builds the environment, and then configures the container so that the environment is activated automatically at the startup of the Singularity container. With Miniconda, a CUDA 10.2 and Ubuntu 18.04 Singularity container file is about 850 megabytes in size.

 

Singularity doesn't necessarily have an extra infrastructure layer, as it doesn't require any services from the host - I bet it would be possible to input-transfer the Singularity executable and run it on an input-transferred container.

 

Alternatively, you could build out the full virtualenv in a directory with Miniconda, and then input-transfer that whole directory and activate it when the job starts up, which would eliminate the need for modules to be available on the exec node. 

 

Michael V Pelletier
Principal Engineer

Raytheon Technologies
Information Technology
Digital Transformation & Innovation

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of West Matthew
Sent: Thursday, July 23, 2020 7:03 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [External] [HTCondor-users] Conda env with workstation pool

 

I am trying to run an analysis on my local workstation pool that relies on software in a conda virtual environment. When the job runs on a remote machine, it does not have access to the libraries in that env back on the submit machine.

Given that virtual environments are common practice when running locally, I am hoping there is some means of making Python libraries accessible without resorting to an extra infrastructure layer like Docker.

 

Cheers,
Matt
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/