
Re: [HTCondor-users] Docker Job on Windows 10



HTCondor cannot currently make use of Docker on Windows 10 because HTCondor needs volume mounts from the native Windows file system into the Docker container to work.

 

The current implementation of Docker on Windows does not support volume mounts from the Windows file system into the container when HTCondor is running as a service. Volume mounts currently work only when the Docker container is started by a logged-in user who has Docker Desktop running, and Docker Desktop is pre-configured to treat an SMB share as a volume mount.

 

Even if there were a way to use volume mounts from the services desktop, there would still be a problem with using them under HTCondor:

 

HTCondor assumes that the path to the job execute directory used by the HTCondor daemons is the same as the path that the job inside the Docker container will see. This is not the case when the Docker image is Linux and the native file system is Windows.
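
To make the mismatch concrete, here is a rough Python sketch. It is not HTCondor code. The execute directory C:\condor\execute\dir_15212 and the mount point /scratch are made-up values for illustration, and to_container_path is a hypothetical helper showing the kind of translation that would be needed.

```python
from pathlib import PureWindowsPath, PurePosixPath

# Hypothetical values for illustration only; neither comes from HTCondor.
HOST_EXECUTE_DIR = PureWindowsPath(r"C:\condor\execute\dir_15212")
CONTAINER_MOUNT = PurePosixPath("/scratch")


def to_container_path(host_path: str) -> PurePosixPath:
    """Map a Windows path under the execute directory to the path a
    Linux container would see if that directory were mounted at
    CONTAINER_MOUNT."""
    # relative_to raises ValueError if host_path is outside the execute dir.
    relative = PureWindowsPath(host_path).relative_to(HOST_EXECUTE_DIR)
    return CONTAINER_MOUNT.joinpath(*relative.parts)


host_file = r"C:\condor\execute\dir_15212\out\result.txt"
container_file = to_container_path(host_file)

# The same file has two different names, one on each side of the mount.
same_path = str(PureWindowsPath(host_file)) == str(container_file)
print(host_file, "->", container_file, "| identical:", same_path)
```

Any path handed to the job unchanged from the Windows side would not resolve inside the Linux container, which is why the assumption of identical paths breaks down.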

 

We have a plan to fix the second problem, hopefully sometime in the 8.9 development cycle, but we are at the mercy of Microsoft or Docker to fix the first problem.

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Werner Koppelstätter
Sent: Thursday, January 9, 2020 10:02 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Docker Job on Windows 10

 

Hello,

 

Does anyone have experience with Docker on Windows 10?

 

I am currently testing running a Docker job on Windows 10.

I installed Docker and use it with Linux containers.

CondorVersion: 8.9.4

DockerVersion: 2.1.0.4

I added the condor-slot users to the docker-users group.

 

After the machine starts the job, the job changes back to submitted. The "hasDocker" attribute of the machine changes from true to false.

In the StarterLog.slot1 I found the following entries:

 

01/09/20 15:46:57 (pid:15212) About to exec docker:java
01/09/20 15:46:57 (pid:15212) Found 1 entries in docker image cache.
01/09/20 15:46:57 (pid:15212) Failed to get userid to run docker job
01/09/20 15:46:57 (pid:15212) DockerAPI::createContainer( python:slim3.7-openjdk13, java, ... ) failed with return value -9
01/09/20 15:46:57 (pid:15212) Failed to start job, exiting

 

 

I have already tested the job by running it with Docker, without HTCondor, on that machine.

I also tested the Docker job via HTCondor on a Linux machine instead of the Windows 10 machine.

Locally and on Linux, the job runs.

 

 

Does anyone have any idea what I'm doing wrong?

 

Regards,

Werner