
[HTCondor-users] home directory in docker universe - docker image's home directory not visible from inside condor job



Hi!

I'm running Docker universe jobs and have run into the following issue:

Setup:
  • I created a Docker image containing 100 "pool users" (condoruser001, condoruser002, ...). Their home directories and related files are created inside the image.
  • Each Condor job is mapped to a slot user on the execute host (condoruser001, condoruser002, etc.).
Symptom:
  • When the container starts on the execute host (the job runs as condoruser001), the image's /home directory is not visible; it appears to be overmounted with the /home of the host machine.
Question:
  • How can I fix this? I want the jobs to see the Docker image's home directories.
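To narrow the symptom down, something like the following could be run as the job itself. This is only a sketch: it assumes Python 3 is available in the image and that /proc is mounted in the container, and the function names (mount_points_from_mountinfo, covering_mount, report) are made up for illustration. It reports which mount covers /home as seen from inside the container, plus the uid and home directory the job actually gets.

```python
import os
import pwd


def mount_points_from_mountinfo(text):
    """Return the mount points listed in /proc/<pid>/mountinfo text."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        # The fifth whitespace-separated field of a mountinfo line
        # is the mount point as seen by the process.
        if len(fields) > 4:
            points.append(fields[4])
    return points


def covering_mount(path, mount_points):
    """Return the longest mount point that is `path` or one of its parents."""
    path = os.path.normpath(path)
    best = None
    for mp in mount_points:
        mp = os.path.normpath(mp)
        prefix = mp if mp.endswith("/") else mp + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best


def report(path="/home"):
    """Print what the job sees: uid, passwd entry, $HOME and the mount over `path`."""
    with open("/proc/self/mountinfo") as fh:
        mounts = mount_points_from_mountinfo(fh.read())
    uid = os.getuid()
    print("uid:", uid)
    try:
        print("passwd home:", pwd.getpwuid(uid).pw_dir)
    except KeyError:
        print("passwd home: no entry for this uid in the image's /etc/passwd")
    print("HOME env:", os.environ.get("HOME"))
    print("mount covering", path + ":", covering_mount(path, mounts))
    if os.path.isdir(path):
        print("entries:", sorted(os.listdir(path)))
```

Calling report() from the job's executable would show whether /home is a separate mount inside the container (covering mount is "/home") or simply part of the container's root filesystem (covering mount is "/").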

Thanks in advance for any help. Some configuration and log output are below.
Gergely

--------------------

Relevant settings on the master side:

DEDICATED_EXECUTE_ACCOUNT_REGEXP = condoruser.*

DOCKER_VOLUMES = scratch, data, y, s, r, p, z
DOCKER_VOLUME_DIR_SCRATCH = /scratch
DOCKER_VOLUME_DIR_DATA = /data
DOCKER_VOLUME_DIR_Y = /y
DOCKER_VOLUME_DIR_S = /s
DOCKER_VOLUME_DIR_R = /r
DOCKER_VOLUME_DIR_P = /p
DOCKER_VOLUME_DIR_Z = /z
DOCKER_MOUNT_VOLUMES = scratch, data, y, s, r, p, z

ps -aux output on the execute host:

condor   2240386  0.1  0.0 106944 13760 ?        Ss   07:48   0:00 condor_starter -f -a slot1_2 10.1.0.210
condor   2240390  0.0  0.0 1643144 50472 ?       Ssl  07:48   0:00 /usr/bin/docker run --cpu-shares=100 --memory=128m --cap-drop=all --hostname gdebrecz-262.0-scorpio005 --name HTCJob262_0_slot1_2_PID2240386 -e TEMP=/var/lib/condor/execute/dir_2240386 -e _CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_2240386 -e _CONDOR_SLOT=slot1_2 -e _CONDOR_CHIRP_CONFIG=/var/lib/condor/execute/dir_2240386/.chirp.config -e BATCH_SYSTEM=HTCondor -e TMPDIR=/var/lib/condor/execute/dir_2240386 -e _CONDOR_JOB_PIDS= -e TMP=/var/lib/condor/execute/dir_2240386 -e OMP_NUM_THREADS=1 -e _CONDOR_AssignedGPUs=CUDA1 -e _CONDOR_JOB_AD=/var/lib/condor/execute/dir_2240386/.job.ad -e CUDA_VISIBLE_DEVICES=1 -e _CONDOR_JOB_IWD=/var/lib/condor/execute/dir_2240386 -e _CHIRP_DELAYED_UPDATE_PREFIX=Chirp* -e _CONDOR_MACHINE_AD=/var/lib/condor/execute/dir_2240386/.machine.ad -e GPU_DEVICE_ORDINAL=1 --volume /var/lib/condor/execute/dir_2240386:/var/lib/condor/execute/dir_2240386 --volume /scratch:/scratch --volume /data:/data --volume /y:/y --volume /s:/s --volume /r:/r --volume /p:/p --volume /z:/z --workdir /var/lib/condor/execute/dir_2240386 --user 1006:1006 dani_tensorflow:v_02 ./condor_exec.exe 0
root     2240426  0.0  0.0  10732  5248 ?        Sl   07:48   0:00 containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/b2d72aff26495b5dd2926c02e125e09e71be06f58d87e532c8471beb4a4ce4c6 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-nvidia
condoru+ 2240446  0.2  0.0  18056  2948 ?        Ss   07:48   0:00 /bin/bash ./condor_exec.exe 0
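
As a cross-check on the process list above, the bind mounts can be read straight off the docker run command line. The following is a rough sketch, not an HTCondor or Docker tool: bind_mounts, is_under and mounts_touching are hypothetical helper names, and CMD is a shortened copy of the command shown above with only the volume-related arguments kept. It only inspects the command line, so it says nothing about what the image itself or the Docker daemon configuration might do to /home.

```python
import shlex

# Shortened copy of the docker run line from the ps output (volume flags kept verbatim).
CMD = (
    "/usr/bin/docker run --cap-drop=all "
    "--volume /var/lib/condor/execute/dir_2240386:/var/lib/condor/execute/dir_2240386 "
    "--volume /scratch:/scratch --volume /data:/data --volume /y:/y "
    "--volume /s:/s --volume /r:/r --volume /p:/p --volume /z:/z "
    "--workdir /var/lib/condor/execute/dir_2240386 --user 1006:1006 "
    "dani_tensorflow:v_02 ./condor_exec.exe 0"
)


def bind_mounts(docker_cmdline):
    """Return (host_path, container_path) pairs from --volume / -v arguments."""
    args = shlex.split(docker_cmdline)
    mounts = []
    i = 0
    while i < len(args):
        spec = None
        if args[i] in ("--volume", "-v") and i + 1 < len(args):
            spec = args[i + 1]
            i += 1
        elif args[i].startswith("--volume="):
            spec = args[i][len("--volume="):]
        if spec is not None:
            parts = spec.split(":")
            if len(parts) >= 2:
                mounts.append((parts[0], parts[1]))
        i += 1
    return mounts


def is_under(child, parent):
    """True if `child` is `parent` itself or lies below it."""
    child = child.rstrip("/") or "/"
    parent = parent.rstrip("/") or "/"
    return parent == "/" or child == parent or child.startswith(parent + "/")


def mounts_touching(path, mounts):
    """Return the bind mounts whose container path is at, above or below `path`."""
    return [
        (host, cont)
        for host, cont in mounts
        if is_under(cont, path) or is_under(path, cont)
    ]


if __name__ == "__main__":
    print(mounts_touching("/home", bind_mounts(CMD)))  # prints []
```

For the command above this prints an empty list: none of the --volume arguments targets /home or a parent of it, so whatever is replacing the image's /home does not seem to come from an explicit bind mount on this command line.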

