Re: [HTCondor-users] Trouble trying to make HTCondor work in a Docker container



On 11/9/2014 7:05 AM, Jim White wrote:
Adding UPDATE_COLLECTOR_WITH_TCP=TRUE gets the workers to show up in the
pool and allowing everything for hosts in the private IP space lets jobs
run.  I'll have a blog post within a few days to show this running in
Google Compute Engine.
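
For anyone following along, here is a minimal sketch of what those two changes might look like in condor_config.local. The 10.* pattern is an assumption borrowed from the FLOCK_FROM/HOSTALLOW lines in the earlier message quoted below, and the exact set of ALLOW_* entries is a guess at what "allowing everything" covered, not the configuration actually used:

```
## Send ClassAd updates to the collector over TCP rather than UDP.
UPDATE_COLLECTOR_WITH_TCP = True

## Assumption: the private network is 10.0.0.0/8; adjust the pattern to
## match your own address range.
ALLOW_READ          = 10.*
ALLOW_WRITE         = 10.*
ALLOW_ADMINISTRATOR = 10.*
ALLOW_CONFIG        = 10.*
ALLOW_NEGOTIATOR    = 10.*
ALLOW_DAEMON        = 10.*
```

A condor_reconfig or a daemon restart should be enough for the daemons to pick the settings up.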

Jim


Jim, this sounds great, please share the blog post once written!

Thanks
Todd




On Sun, Nov 9, 2014 at 2:14 AM, Jim White <jimwhite@xxxxxx> wrote:

    Well, I've gotten past that issue by enabling DNS and (after dealing
    with Docker not permitting limit change operations) I can almost see
    daylight.  Submit/execute on the central manager host works, and the
    workers can see the queue on the central manager but they never join
    the pool, nor do jobs submitted at a worker ever get matched.
    The logs on the central manager never show any connections or
    requests from the worker's IP address (although there must be some
    since reads for condor_q and condor_status work).  So I figure this
    must be some ordinary Condor config mistake on my part or a
    complication due to the IP address mapping between containers and
    hosts but I'm surprised I can't find any error messages anywhere in
    the logs on either side.

    My current condor_config.local looks something like this
    (CONDOR_HOST and DAEMON_LIST are set by calling condor_configure
    when the container runs):

        ## Inside Docker we don't want to rely on DNS for user
        ## authentication.

        TRUST_UID_DOMAIN = TRUE
        UID_DOMAIN = my-condor-pool

        ## Use the shared port daemon so we don't need to deal with multiple
        ## ephemeral ports, which are not yet supported by Docker.

        USE_SHARED_PORT = True
        SHARED_PORT_ARGS = -p 9886

        SEC_DEFAULT_NEGOTIATION = NEVER
        SEC_DEFAULT_AUTHENTICATION = NEVER

        ## We're not gonna try and reconfigure for each host involved.
        ## Just rely on our private network.
        ALLOW_READ            = *,*@*
        ALLOW_WRITE           = *,*@*
        ALLOW_ADMINISTRATOR   = *,*@*
        ALLOW_CONFIG          = *,*@*
        ALLOW_NEGOTIATOR      = *,*@*
        ALLOW_DAEMON          = *,*@*

        # This didn't seem to change the setting for the collector:
        # MAX_FILE_DESCRIPTORS=1024
        # Maybe DEFAULT_MAX_FILE_DESCRIPTORS?
        # The collector wants to allow at least 10240 open descriptors,
        # but Docker doesn't permit changing limits.
        COLLECTOR_MAX_FILE_DESCRIPTORS=1024

        # Fiddling with these has had no effect so far...
        FLOCK_FROM=10.*
        FLOCK_TO=$(COLLECTOR_HOST)
        HOSTALLOW_READ=10.*
        HOSTALLOW_WRITE=10.*


    Jim