
[HTCondor-users] Singularity Integration



Hi

We have a few researchers who use HTCondor with Singularity, and at some point in the past couple of weeks their jobs started failing because of the scratch mount of /var/tmp.

A submit file like this:


universe = vanilla
executable = ls
arguments = -la /var/ /
should_transfer_files = YES
+SingularityImage = "/cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave/master"
transfer_executable = False
output = $(cluster).$(Process).out
error = $(cluster).$(Process).err
log = $(cluster).$(Process).log
queue

fails with
ERROR  : Not mounting requested scratch directory (already mounted in container): /var/tmp
ABORT  : Retval = 255

while

universe = vanilla
executable = /usr/bin/singularity
arguments = exec -w /cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave/master ls -la /var /
output = basic.$(cluster).out
error = basic.$(cluster).err
log = basic.$(cluster).log

produces the expected result.

I have not been able to identify precisely when the regression started, but we see it with HTCondor 8.6.12 and Singularity 2.5.2 and 2.6.0.

From the condor logs, I can see that the singularity invocation looks like this:

Arguments updated for executing with singularity: /usr/bin/singularity exec -B /cvmfs -B /archive -B /hdfs -B /home:/home:rw -B /local/condor/execute/dir_101929:/srv --pwd /srv -S Owner) -S /local/" -S /var/tmp -S strcat( "/tmp -C /cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave/master /etc/condor/modules/user-job-wrapper /local/condor/execute/dir_101929/ls -la /var/ /

This does not look right: the closing parenthesis of the strcat comes before the opening one, the executable, ls, is glued to one of the arguments, etc.
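
One possible reading of the -S arguments in that logged command line, as a minimal sketch: if the scratch-directory list were configured as an unevaluated strcat() expression and then split on commas, the fragments would come out exactly like this. The expression in `expr` is hypothetical (the actual configuration on the execute node is not shown here), and the reversed emission order is likewise only an assumption made to match the log.

```python
# Hypothetical configured value: the real setting is not shown in this post.
expr = 'strcat( "/tmp,/var/tmp,/local/", Owner)'

def naive_split(value):
    """Split on commas without evaluating the expression first."""
    return [tok.strip() for tok in value.split(",")]

tokens = naive_split(expr)

# Assumption: the fragments are emitted in reverse order, one -S flag each.
flags = []
for tok in reversed(tokens):
    flags += ["-S", tok]

rendered = " ".join(flags)
print(rendered)
```

If that reading is right, the expression is being treated as a literal string rather than evaluated before the list is split.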

Setting mount tmp = no in singularity.conf or fiddling with the MOUNT_SCRATCH_DIR knob in condor does not seem to improve the situation.

How can I make the first form of my submit file work?


best
Philippe Grassia