
Re: [HTCondor-users] Linux Containers to consider with HTCondor in the near future?



Hi Oliver,

I think we were perhaps talking a bit at different levels --

What I was saying is that the user should describe what the job needs.

If the job simply says what it needs, then how the job is executed -- forking processes, containers, VMs, or binary translation to an IBM mainframe -- becomes a behind-the-scenes detail.

With respect to Docker vs Singularity vs OCI - I'm not sure there's any particular update beyond the details you lay out below.

Brian

On May 21, 2019, at 6:00 PM, Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:

Dear Brian,

On 17.05.19 at 14:50, Bockelman, Brian wrote:
Hi Max,

I find it better if users "tell HTCondor what they need, not how to do it".

I don't think this is what Max meant - at least it's not what I was asking for when I asked for OCI support ;-). 


That is, the users are better served if they are asked to put in a submit file line like:

+RequiredOS = "rhel7"

That's (almost) exactly what we are using right now with Singularity. My question for OCI support was _not_ for a new universe, but I was asking for having HTCondor support OCI "behind the scenes"
to allow for a more free choice of the container runtime. That means that I (as administrator) would love to select a different implementation without the users seeing anything of that - 
exactly what you suggest :-). 

So what's "bad" about Singularity and Docker, and why would I love to use something else? Of course, that really depends on your use case. 

From my point of view, there are four main issues with Singularity:
- Interactive jobs only work with a hack in 8.6, and don't really work yet in 8.8 - but Greg's on it, so that's (hopefully) a temporary thing. 
 In my opinion, though, if you require interactive jobs, things are still not in reasonably good shape just yet for Singularity. 
 And with the "users should choose what they need" approach, which I also love, you definitely _need_ interactive jobs
 to give the users the environment they ask for - for interactive development, testing, and to have the very same environment their batch job will see. 
- The rate at which CVEs appear is still pretty high. That's expected, the project is still young - and the developer community
 is (still, even with Sylabs!) small as compared to other full-fledged off-the-shelf solutions. 
- There are still breaking changes introduced - regularly. 
- There's no distro-maintained version of any recent release available yet (and following the discussion on distro bug trackers, it's clear there are reasons for that,
 and they don't reflect well on the quality of upstream's code). 

What are the issues with Docker? 
Of course, the "Docker universe" may be seen as an issue, since it's indeed not having the users tell HTCondor "what they need", but rather "how to do it". 
But this could be hidden from the users. 
My main issues with Docker are:
- Daemon running as root (but CVEs have become really rare, so maybe I am just too anxious / conservative here). 
- The way images are stored does not fit our use case very well. 
- The overhead is a bit heavy by default (network virtualisation really kills MPI performance by design, etc.). 
All these things are reasons why Singularity, Charliecloud, runc, podman and more were born. 

So why am I asking for OCI? 
There are two main, strong reasons:
- Freedom of choice of implementation! 
 I'd love to be able to use runc / podman. They are shipped with RHEL 8 and will surely receive good support from Red Hat - if Red Hat does take care of something,
 they do it with a lot of manpower. They even have CRIU integration,
 and while that might not work perfectly well, it's still better than nothing (just look at https://github.com/sylabs/singularity/issues/468 which is over 2 years old). 
 I'd love to use these with user namespace support, of course. 
- Hopefully, less work for the HTCondor developers, since there can (hopefully) be a common implementation on the HTCondor end for most (all?) OCI implementations,
 maybe with a few quirks / knobs to be tweaked for each runtime implementation, but not one implementation in HTCondor for each single container runtime as it is done right now. 

I strongly believe the HTCondor devs agree with my last point. That was also why https://github.com/htcondor/htcondor/pull/18 was closed - after discussion with Greg, I agreed that adding
another container implementation (even one that's hidden from the user!) is a maintenance burden which is not really worth it. It's better to have the HTCondor devs invest their energy
in developing OCI support in the future. 

I hope this explains it and you have made it through my long mail ;-). 

Cheers and all the best,
Oliver


(what they need!) as opposed to asking them to learn the Docker universe:

universe = docker
docker_image = myrepo:custom_centos7

(how to do it!)

This way, you have some ability to decide on switching the implementation later without having to worry about asking users to change their submit files.
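
As a rough illustration of that separation (not how HTCondor implements it), the following Python sketch maps a declared requirement such as "rhel7" to a runtime-specific command line. The image paths, image names, and the build_command helper are hypothetical and exist only for this example; the only real pieces are the basic "singularity exec" and "docker run --rm" invocations.

```python
# Hypothetical admin-side tables: which image satisfies which declared OS,
# per container runtime. All image names and paths are made up.
IMAGES = {
    "singularity": {"rhel6": "/images/rhel6.sif", "rhel7": "/images/rhel7.sif"},
    "docker": {"rhel6": "example/rhel6:latest", "rhel7": "example/rhel7:latest"},
}

# How each runtime is asked to run a command inside an image.
LAUNCH_PREFIX = {
    "singularity": ["singularity", "exec"],
    "docker": ["docker", "run", "--rm"],
}


def build_command(required_os, job_argv, runtime):
    """Translate "what the job needs" into "how to run it" for one runtime."""
    try:
        image = IMAGES[runtime][required_os]
    except KeyError:
        raise ValueError(
            "no image for OS %r with runtime %r" % (required_os, runtime)
        )
    return LAUNCH_PREFIX[runtime] + [image] + list(job_argv)
```

The job only ever states "rhel7"; switching the runtime argument from "singularity" to "docker" changes the resulting command without touching anything the user wrote.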

That said, both Singularity and Docker support are in reasonably good shape.  I happen to use both, depending on the application.

Brian

On May 17, 2019, at 4:48 AM, Fischer, Max (SCC) <max.fischer@xxxxxxx> wrote:

Hi all,

we are currently transitioning our HTC farm from RHEL6 to RHEL7. As we should probably have expected, every VO/user realises at the last minute that they are *not* quite ready to make the transition. So for our next revision, we would like to add containerisation to have per-VO/user environments.

What are our options to consider for containers with HTCondor in the near/medium future? Say, about a year from now?
We've already used HTC with both Docker and Singularity; each had its quirks, and we would rather not lock ourselves into either. I've seen that Oliver asked for OCI support about a year ago, but I cannot find any current information on this.

Cheers,
Max

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/




-- 
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--