
[HTCondor-users] Advertising Docker image names, and adding an image name match requirement to Docker jobs



The illustrious Greg Thain said that advertising available Docker image names was "coming soon" at HTCondor Week...

...in 2015. (We still love ya, Greg!)

In the meantime, I cooked up a little STARTD_CRON job to accomplish this (tested in 8.6.13), as well as a job transform that adds the requested image name to the job's requirements expression:

if defined DOCKER
STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) DockerImages
STARTD_CRON_DOCKERIMAGES_MODE           = Periodic
STARTD_CRON_DOCKERIMAGES_PERIOD         = 1h
STARTD_CRON_DOCKERIMAGES_RECONFIG_RERUN = True
STARTD_CRON_DOCKERIMAGES_EXECUTABLE     = /bin/bash
STARTD_CRON_DOCKERIMAGES_ARGS           = \
" -c ' \
    IMGS=$(DOLLAR)($(DOCKER) images --format ""{{.Repository}}:{{.Tag}}"" 2>/dev/null | \
    /usr/bin/awk ""function printimg(str) { \
            if( length(str) == 0) { return }; \
            m = match(str, \"":latest\$(DOLLAR)\""); \
            if(m != 0) { \
                printf(\""%s, %s, \"", substr(str, 1, m-1), str) \
            } else { \
                printf(\""%s, \"", str) \
            } \
        } ; \
        { \
            { \
                t = lastline; \
                lastline = \$(DOLLAR)0; \
                \$(DOLLAR)0 = t \
            }; \
            printimg(\$(DOLLAR)0) \
        } \
        END { printf lastline } "" \
    ); \
    if [ -n ""$(DOLLAR)IMGS"" ] ; then \
        echo DockerImages = \""$(DOLLAR)IMGS\""; \
    fi; \
    echo -- \
' "
endif

(Hopefully line-wrapping didn't mangle this.)

The ARGS for the startd-cron job contains an inline shell script which runs the "docker images" command and converts it with awk into a machine attribute called "DockerImages," a comma-delimited list in a ClassAd string. It expands any ":latest"-tagged images to add the unqualified name of the image as well. The awk "lastline" trick just prevents a trailing comma.
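Since the config-file escaping makes the inline script hard to read, here is a rough standalone sketch of the same idea as a plain shell function. It is a simplified equivalent, not the exact code from the config above: the function name format_docker_images is made up for illustration, it reads "repository:tag" lines from stdin instead of calling docker itself, and it builds the list with a separator variable rather than the "lastline" trick. One side effect of that difference is that a ":latest" image on the final line gets its unqualified name added too, which the END block in the original does not appear to do.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative only): reads one "repository:tag"
# per line on stdin and prints a DockerImages attribute followed by the
# "--" terminator line, in the same shape as the startd-cron output.
format_docker_images() {
    awk '
        # Skip blank lines.
        length($0) == 0 { next }
        {
            # For images tagged ":latest", list the unqualified name first.
            if (match($0, /:latest$/)) {
                out = out sep substr($0, 1, RSTART - 1)
                sep = ", "
            }
            # Always list the fully tagged name.
            out = out sep $0
            sep = ", "
        }
        END {
            # Only emit the attribute when at least one image was seen.
            if (out != "") printf "DockerImages = \"%s\"\n", out
            print "--"
        }
    '
}
```

To try it against a real Docker install, something like this should work: docker images --format '{{.Repository}}:{{.Tag}}' | format_docker_images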

On a machine where the DOCKER macro is defined, the output looks like this:

DockerImages = "nvidia/cuda:10.0-base-ubuntu18.04, nvcr.io/nvidia/pytorch:18.11-py3, nvcr.io/nvidia/mxnet:18.11-py3, nvcr.io/nvidia/tensorrt:18.11-py3, nvcr.io/nvidia/digits:18.10, nvcr.io/nvidia/tensorflow:18.10-py3, nvcr.io/nvidia/caffe:18.10-py2, nvcr.io/nvidia/cuda, nvcr.io/nvidia/cuda:latest, nvcr.io/nvidia/caffe2:18.08-py3, nvcr.io/nvidia/theano:18.08, nvcr.io/nvidia/torch:18.08-py2, nvcr.io/nvidia/cntk:18.08-py3, ufoym/deepo:all-py36-jupyter"
--

This string can be used with the ClassAd function "stringListMember()" and its default separator to match the image name. The following job transform will add the image name match to the requirements expression for the job:

JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) RequireDockerImage
JOB_TRANSFORM_RequireDockerImage = [ \
    Requirements = WantDocker && ! isUndefined(DockerImage); \
    copy_Requirements = "PreDockerImageRequirements"; \
    set_requirements = stringListMember(MY.DockerImage, TARGET.DockerImages) && PreDockerImageRequirements \
]

The job transform config needs to fall outside the "if defined DOCKER" block, since it needs to exist for submitters, not just Docker-equipped machines.

With this in place, you don't need to ensure that all of the Docker machines in your pool have exactly the same list of images. One drawback, on the other hand, is that a job which specifies an image that no machine actually has (via a typo or what have you) will sit idle in the same way as a job requesting a zillion terabytes of memory, rather than being held and generating an error e-mail.
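One way to soften that drawback is to check, before submitting, whether any machine advertises the image at all. The following is only a sketch under a few assumptions: the function name image_advertised is made up, it expects one DockerImages value per line on stdin (for instance from "condor_status -af DockerImages", assuming that prints the bare string), and it splits on spaces and commas with a case-sensitive comparison to approximate what stringListMember() does with its default separator.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative only): exits 0 if image "$1"
# appears in any delimited list read from stdin, and 1 otherwise.
image_advertised() {
    awk -v img="$1" '
        {
            # Split each advertised list on runs of spaces and commas.
            n = split($0, items, /[ ,]+/)
            for (i = 1; i <= n; i++) {
                # Exact, case-sensitive comparison against the wanted image.
                if (items[i] == img) { found = 1 }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage might look like: condor_status -af DockerImages | image_advertised nvcr.io/nvidia/cuda || echo "no machine advertises that image"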

Michael V. Pelletier
Information Technology
Digital Transformation & Innovation
Integrated Defense Systems
Raytheon Company