
Re: [HTCondor-users] Duplicate mount points for docker jobs



Dear all,

I remember seeing this in the past (v9.5.0?) whenever the execute node was also the submit node and the filesystem was not shared. The workaround at the time was to make sure the submit node was not also the docker execute node.

I didn't have time to investigate further back then, but the problem was reproducible every time.
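
For context, Docker reports "Duplicate mount point" when two mounts in one container request the same path inside the container. Below is a minimal sketch for spotting that in a list of volume specs, assuming they use the `host:container[:options]` form accepted by `docker run -v`. The function names and the sample specs are hypothetical illustrations, not what HTCondor actually passes to Docker.

```python
from collections import Counter


def container_target(spec: str) -> str:
    """Return the container-side path of a `docker run -v` style spec."""
    parts = spec.split(":")
    # "host:container" or "host:container:options" -> second field;
    # a bare "container" path (anonymous volume) -> the only field.
    return parts[1] if len(parts) >= 2 else parts[0]


def duplicate_targets(specs: list[str]) -> list[str]:
    """List container paths that are requested by more than one spec."""
    counts = Counter(container_target(s) for s in specs)
    return sorted(path for path, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical specs: the same scratch directory is mounted twice.
    specs = [
        "/var/lib/condor/execute/dir_13598:/var/lib/condor/execute/dir_13598",
        "/var/lib/condor/execute/dir_13598:/var/lib/condor/execute/dir_13598",
        "/data:/data:ro",
    ]
    print(duplicate_targets(specs))
```

If the actual mount list for a failing job can be recovered (for example from the starter log on the execute node), feeding it to `duplicate_targets` would show which container path is requested twice.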

Kind regards,
Marco



On 28/03/2022 19:01, Jacek Kominek via HTCondor-users wrote:
Hi,

I am trying to run a docker job, but I keep getting the following shadow exception:

"Error response from daemon: Duplicate mount point: /var/lib/condor/execute/dir_13598"

This image has run in the past without problems, so I am a bit puzzled as to why it would error out now.

It was rebuilt recently without any code changes (just a scheduled refresh). The first time I tried to run it, I got a different shadow exception, which also held the job:

"Unable to find image 'XXX:latest' locally. Pulling from...."

I checked on the execute node: it did pull the correct image, and the image shows up when I manually run `docker image ls`.

Any ideas would be helpful, thank you in advance!

Best,

-Jacek