
Re: [HTCondor-users] Docker Universe jobs failing because of a file transfer problem



Hi Greg,

 

Sadly, adding "when_to_transfer_output = on_exit" did not resolve the issue.

The "should_transfer_files = yes" option was already present.

 

To clarify something that may not have been clear in my original message: I tested this exact same job on an existing HTCondor pool running directly on a cluster of machines, and it worked as intended.

The issue only shows up when I simulate a cluster of machines using separate Docker containers. Only then does the missing input file problem appear.

 

Thanks for the help,

 

Gaetan

 


Gaetan Geffroy
Junior Software Engineer
Terma GmbH
T +49 6151 86005 43 (direct)
 


 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Greg Thain via HTCondor-users
Sent: Thursday, April 13, 2023 17:00
To: htcondor-users@xxxxxxxxxxx
Cc: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Docker Universe jobs failing because of a file transfer problem

 


On 4/12/23 12:35, Gaetan Geffroy wrote:

Hi,

 

This issue appears both with the htcondor/mini image and with the trio of htcondor/cm, htcondor/submit, and htcondor/execute images.

What I do is the following:

  1. Launch the docker containers using the images. If using htcondor/mini, I use the host network and mount /var/run/docker.sock in it.
    If using the other images, I connect them to a docker network (condor-network) that I created beforehand. I also mount /var/run/docker.sock to the execute nodes.
  2. In the relevant container, I run chmod 666 /var/run/docker.sock, then condor_restart
  3. I use condor_status slot1@xxxxxxxxxxxxxxxxx -json | grep Has to check the presence of the "HasDocker" property
  4. I submit the following job:
    universe              = docker
    docker_image          = python:3.8.10
    should_transfer_files = yes
    executable            = /usr/bin/python
    arguments             = test.py
    transfer_input_files  = test.py
    output                = test_docker.out
    error                 = test_docker.err
    log                   = test_docker.log
    initial_dir           = /tmp
    queue 1
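
The HasDocker check in step 3 can also be done without grep. The following is only a sketch: it assumes condor_status -json prints a JSON array of slot ads, and the helper name slots_with_docker and the sample slot names are made up for illustration.

```python
import json


def slots_with_docker(status_json: str) -> dict:
    """Map each slot name to True if its ad advertises HasDocker = true."""
    text = status_json.strip()
    # Assumption: when no slot matches, the output may be empty rather than "[]".
    ads = json.loads(text) if text else []
    return {ad.get("Name", "<unnamed>"): ad.get("HasDocker") is True for ad in ads}


# Hypothetical sample input; real ads carry many more attributes.
sample = '[{"Name": "slot1@execute-1", "HasDocker": true}, {"Name": "slot1@execute-2"}]'
print(slots_with_docker(sample))
```

A slot whose ad has no HasDocker attribute at all is reported as False, which is the case worth catching after the chmod and condor_restart in step 2.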

 

Hi Gaetan:

I think there's a bug with docker universe where it requires file transfer to be on.  Try adding the following to the submit file:

should_transfer_files = yes

when_to_transfer_output = on_exit
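
For reference, here is a sketch of the original job with both settings applied, written as a small Python script that prints the submit description. The helper render_submit is hypothetical and only formats text; it does not talk to HTCondor.

```python
def render_submit(settings: dict, queue: int = 1) -> str:
    """Render key/value pairs as an aligned HTCondor submit description."""
    width = max(len(key) for key in settings)
    lines = [f"{key.ljust(width)} = {value}" for key, value in settings.items()]
    lines.append(f"queue {queue}")
    return "\n".join(lines) + "\n"


# Values copied from the job in the original message, plus the two
# file transfer settings suggested above.
job = {
    "universe": "docker",
    "docker_image": "python:3.8.10",
    "should_transfer_files": "yes",
    "when_to_transfer_output": "on_exit",
    "executable": "/usr/bin/python",
    "arguments": "test.py",
    "transfer_input_files": "test.py",
    "output": "test_docker.out",
    "error": "test_docker.err",
    "log": "test_docker.log",
    "initial_dir": "/tmp",
}

print(render_submit(job))
```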

 

-greg