Re: [HTCondor-users] Docker Universe jobs failing because of a file transfer problem



On 4/12/23 12:35, Gaetan Geffroy wrote:

Hi,

 

This issue appears both when using the htcondor/mini image and when using the trio of htcondor/cm, htcondor/submit, and htcondor/execute images.

What I do is the following:

  1. Launch the docker containers using the images. If using htcondor/mini, I use the host network and mount /var/run/docker.sock in it.
    If using the other images, I connect them to a docker network (condor-network) that I created beforehand. I also mount /var/run/docker.sock to the execute nodes.
  2. In the relevant container, I run chmod 666 /var/run/docker.sock, then condor_restart
  3. I use condor_status slot1@xxxxxxxxxxxxxxxxx -json | grep Has to check the presence of the "HasDocker" property
  4. I submit the following job:
    universe              = docker
    docker_image          = python:3.8.10
    should_transfer_files = yes
    executable            = /usr/bin/python
    arguments             = test.py
    transfer_input_files  = test.py
    output                = test_docker.out
    error                 = test_docker.err
    log                   = test_docker.log
    initial_dir           = /tmp
    queue 1


Hi Gaetan:

I think there's a bug with docker universe where it requires file transfer to be on.  Try adding the following to the submit file:

should_transfer_files = yes

when_to_transfer_output = on_exit
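
For reference, here is a sketch of how the submit file from the original message might look with that line added. It is untested here and assumes every other setting stays exactly as posted:

```
# Submit file from the original message, with when_to_transfer_output added.
universe                = docker
docker_image            = python:3.8.10
# Already present in the original; kept so file transfer is explicitly on.
should_transfer_files   = yes
# New line: transfer output back when the job exits.
when_to_transfer_output = on_exit
executable              = /usr/bin/python
arguments               = test.py
transfer_input_files    = test.py
output                  = test_docker.out
error                   = test_docker.err
log                     = test_docker.log
initial_dir             = /tmp
queue 1
```

The comments sit on their own lines because a trailing # on a submit command line would be read as part of the value.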


-greg