
[HTCondor-users] Singularity not mounting the home directory for a submitted job



Hi folks,

Does anyone have some insight into how Singularity decides when to bind-mount the user's home directory?

I've got a situation where, when I run the command directly on the exec machine, the container spins up successfully with the home directory in place, but when it is started via condor_run -a 'MY.SingularityImage = "imagefile"' or via a submit description, the container doesn't get the home directory, and so the application fails to launch.

The bind mount arguments include the /home/condor directory, since that's where the global libexec lives in my config, but omitting it didn't change the behavior. One error message I see is "/home/user already mounted in container", so Singularity is apparently detecting a mount that isn't actually there.

I have a user_job_wrapper in place, but removing it didn't change the behavior either. I also wrapped the Singularity binary in order to add the "--nv" option for jobs where it's appropriate, until SINGULARITY_EXTRA_ARGUMENTS is in a production release, but investigating that was another dead end. I tried inserting a "-H ${HOME}:${HOME}" argument along with --nv, but that seems to be ignored, with the same "already mounted in container" message.
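For reference, the kind of wrapper I mean can be sketched roughly as follows. This is only an illustration under stated assumptions, not my actual wrapper: the REAL_SINGULARITY path and the helper names are hypothetical, and it assumes the extra options belong right after the exec/run/shell subcommand.

```python
import os

# Hypothetical location of the renamed, real Singularity binary.
REAL_SINGULARITY = "/usr/bin/singularity.real"

# Subcommands that accept container options such as --nv and -H.
SUBCOMMANDS = ("exec", "run", "shell")


def home_args(home):
    """Build the explicit home bind argument, i.e. -H ${HOME}:${HOME}."""
    return ["-H", f"{home}:{home}"]


def inject_args(argv, extra):
    """Return argv with `extra` inserted right after the first subcommand.

    If no known subcommand is present (e.g. `--version`), argv is
    returned unchanged.
    """
    for i, arg in enumerate(argv):
        if arg in SUBCOMMANDS:
            return argv[: i + 1] + list(extra) + argv[i + 1:]
    return list(argv)


def build_command(argv, environ):
    """Build the full argument vector for the real binary."""
    extra = ["--nv"]
    home = environ.get("HOME")
    if home:
        extra += home_args(home)
    return [REAL_SINGULARITY] + inject_args(argv, extra)


def main(argv):
    """Replace this process with the real Singularity binary."""
    command = build_command(argv, os.environ)
    os.execv(REAL_SINGULARITY, command)
```

The wrapper script itself would call main(sys.argv[1:]) as its entry point. Logging the result of build_command() before the exec is a simple way to confirm which arguments actually reach Singularity.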

What's even more peculiar is that it works for some user accounts, but not others. I checked the environment variables for any differences, but nothing jumped out at me, and condor_run does a getenv=True in any case.
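In case anyone wants to repeat the environment comparison, here is a minimal sketch of one way to do it, assuming the output of `env` has been captured as text for a working account and a failing account. The function names are hypothetical, and values containing embedded newlines would not be parsed correctly.

```python
def parse_env(text):
    """Parse NAME=value lines (as printed by `env`) into a dict."""
    env = {}
    for line in text.splitlines():
        if "=" in line:
            name, _, value = line.partition("=")
            env[name] = value
    return env


def diff_env(working, failing):
    """Return {name: (working_value, failing_value)} for every variable
    that differs between the two environments; None means unset."""
    names = sorted(set(working) | set(failing))
    return {
        name: (working.get(name), failing.get(name))
        for name in names
        if working.get(name) != failing.get(name)
    }
```

For example, diff_env(parse_env(text_a), parse_env(text_b)) lists only the variables that are set differently or are missing on one side, which is easier to scan than two full dumps.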

Any suggestions? Thanks

Michael V. Pelletier
Information Technology
Digital Transformation & Innovation
Integrated Defense Systems
Raytheon Company