Re: [HTCondor-users] Condor in a Container works! (was Re: Trouble trying to make HTCondor work in a Docker container)



Jim - 

Mesos is the elastic scale-out solution for Kubernetes in the on-prem case.

Otherwise, I saw some demos where you can define policies to auto-scale using Google's cluster manager tools that wrap the 'container engine'.  From what I saw it was quite snazzy: you could scale up and down based on a slew of variables.

If you have more questions, feel free to hit up the GCE folks on freenode in #google-containers.

Cheers, 
Tim


From: "Jim White" <jimwhite@xxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Sent: Monday, November 10, 2014 3:04:01 PM
Subject: [HTCondor-users] Condor in a Container works! (was Re: Trouble trying to make HTCondor work in a Docker container)

Hi guys!  Thanks for the interest.  I did get it working and will write up a little README and a blog post later today.  My current post at http://jimwhite.github.io/ is about running the BLLIP parser and its Python GUI in Docker.

The current config runs without privileged mode (because I want this to be a vanilla Kubernetes thing), but I also want to be able to use Docker images as executables for jobs, and that will require Docker-in-Docker.

My new unexpected challenge: the reason I hadn't seen how to change the number of Kube minions (cloud instances as opposed to pod replicas) dynamically is that it is not currently supported (at least on GCE).  I may do something based on the newly released GCE auto-scaler, or perhaps the new Google Kubernetes PaaS will do what I need.

I've been looking around at Condor auto-scaling solutions to see whether there is any existing code that I could use for adjusting the replication controller, but I haven't seen anything that seems better than writing it from scratch.  Any suggestions?
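
For anyone writing such a scaler from scratch, here is a minimal sketch of the sizing decision it would have to make. Everything in it is an assumption for illustration: the function name desired_workers, its parameters, and the default bounds are hypothetical, the job counts are assumed to come from the pool's queue (for example via condor_q), and actually applying the result to the replication controller is left out.

```python
import math


def desired_workers(idle_jobs: int, running_jobs: int, slots_per_worker: int,
                    min_workers: int = 1, max_workers: int = 20) -> int:
    """Return how many worker replicas would cover all queued and running jobs.

    Hypothetical helper: the names and default bounds are illustrative
    assumptions, not part of HTCondor or Kubernetes.
    """
    if slots_per_worker <= 0:
        raise ValueError("slots_per_worker must be positive")
    # One slot per job, rounded up to whole workers.
    needed = math.ceil((idle_jobs + running_jobs) / slots_per_worker)
    # Clamp so the pool never shrinks to nothing or grows without bound.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    print(desired_workers(idle_jobs=37, running_jobs=8, slots_per_worker=4))
```

Counting running jobs as well as idle ones keeps the scaler from shrinking the pool underneath work that is already executing; a real scaler would probably also want some delay before scaling down.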

Jim

On Mon, Nov 10, 2014 at 10:33 AM, Tim St Clair <tstclair@xxxxxxxxxx> wrote:
Hey folks -

For this use case, you're likely going to need to disable all cgroup isolation in Condor when running.  Otherwise you would have to re-parent, and I don't recommend even trying that, because you may enter a hurt locker.  Also, your containers may need to be privileged to run; see https://github.com/GoogleCloudPlatform/kubernetes/issues/391 for more details.

Best of luck,
Tim

----- Original Message -----
> From: "Todd Tannenbaum" <tannenba@xxxxxxxxxxx>
> To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
> Sent: Monday, November 10, 2014 12:15:14 PM
> Subject: Re: [HTCondor-users] Trouble trying to make HTCondor work in a Docker container
>
> On 11/9/2014 7:05 AM, Jim White wrote:
> > Adding UPDATE_COLLECTOR_WITH_TCP=TRUE gets the workers to show up in the
> > pool and allowing everything for hosts in the private IP space lets jobs
> > run.  I'll have a blog post within a few days to show this running in
> > Google Compute Engine.
> >
> > Jim
> >
>
> Jim, this sounds great, please share the blog post once written!
>
> Thanks
> Todd
>
>
>
>
> > On Sun, Nov 9, 2014 at 2:14 AM, Jim White <jimwhite@xxxxxx
> > <mailto:jimwhite@xxxxxx>> wrote:
> >
> >     Well, I've gotten past that issue by enabling DNS and (after dealing
> >     with Docker not permitting limit change operations) I can almost see
> >     daylight.  Submit/execute on the central manager host works, and the
> >     workers can see the queue on the central manager but they never join
> >     the pool, nor do jobs submitted at a worker ever get matched.
> >     The logs on the central manager never show any connections or
> >     requests from the worker's IP address (although there must be some
> >     since reads for condor_q and condor_status work).  So I figure this
> >     must be some ordinary Condor config mistake on my part or a
> >     complication due to the IP address mapping between containers and
> >     hosts but I'm surprised I can't find any error messages anywhere in
> >     the logs on either side.
> >
> >     My current condor_config.local looks something like this
> >     (CONDOR_HOST and DAEMON_LIST are set by calling condor_configure
> >     when the container runs):
> >
> >         ## Inside Docker we don't want to rely on DNS for user authentication.
> >
> >         TRUST_UID_DOMAIN = TRUE
> >         UID_DOMAIN = my-condor-pool
> >
> >         ## Use the shared port daemon so we don't need to deal with multiple
> >         ## ephemeral ports, which are not yet supported by Docker.
> >
> >         USE_SHARED_PORT = True
> >         SHARED_PORT_ARGS = -p 9886
> >
> >         SEC_DEFAULT_NEGOTIATION = NEVER
> >         SEC_DEFAULT_AUTHENTICATION = NEVER
> >
> >         ## We're not gonna try and reconfigure for each host involved.
> >         ## Just rely on our private network.
> >         ALLOW_READ            = *,*@*
> >         ALLOW_WRITE           = *,*@*
> >         ALLOW_ADMINISTRATOR   = *,*@*
> >         ALLOW_CONFIG          = *,*@*
> >         ALLOW_NEGOTIATOR      = *,*@*
> >         ALLOW_DAEMON          = *,*@*
> >
> >         # This didn't seem to change the setting for the collector:
> >         # MAX_FILE_DESCRIPTORS=1024
> >         # Maybe DEFAULT_MAX_FILE_DESCRIPTORS?
> >         # The collector wants to allow at least 10240 open descriptors,
> >         # but Docker doesn't permit changing limits.
> >         COLLECTOR_MAX_FILE_DESCRIPTORS=1024
> >
> >         # Fiddling with these has had no effect so far...
> >         FLOCK_FROM=10.*
> >         FLOCK_TO=$(COLLECTOR_HOST)
> >         HOSTALLOW_READ=10.*
> >         HOSTALLOW_WRITE=10.*
> >
> >
> >     Jim
> >
> >
> >
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>

--
Cheers,
Timothy St. Clair
Red Hat Inc.