
[HTCondor-users] Usage of singularity for interactive jobs



Dear experts, 

Since my last request was answered so quickly and helpfully, I am now at the next step (still using HTCondor 8.6.5, of course).
We would like to use singularity, and still allow for interactive jobs (so users can test in the same environment their job will use). 

I am currently using a simple Ubuntu Singularity container with sshd installed (I guess that's needed?).

I have immediately hit several problems, but maybe I am just missing some magic: 
1) "bind path" specifications from singularity.conf are effectively ignored, since htcondor passes "-C" to singularity. 
2) SINGULARITY_BIND_EXPR does not really work as expected for me. 
   Using:
   SINGULARITY_BIND_EXPR = "/usr/libexec/condor/", "/pool"
   singularity is passed only "-B /usr/libexec/condor/", and "/pool" is simply ignored.
   Is my syntax wrong? Using:
   SINGULARITY_BIND_EXPR = "/usr/libexec/condor/:/usr/libexec/condor/", "/pool:/pool"
   yields the same issue. 
3) condor_ssh_to_job requires sshd to be present inside the container. Providing that, I also need to use "UsePrivilegeSeparation no" in my condor_ssh_to_job_sshd_config_template, I think. 
   So far, so good... 
   However, since sshd is run in the container, I have two issues now. sshd is called like: 
   /usr/sbin/sshd -i -e -f /pool/condor/dir_<PID>/.condor_ssh_to_job_1/sshd_config
   Singularity is called like:
   singularity exec -B /pool/condor/dir_<PID>:/srv --pwd /srv -S /var/run -S /var/tmp -S /tmp -C /scratch/foo.img /bin/sleep 180
   a) So that means I have to provide an additional bind path for "/pool", I guess?
      It might be easier to recycle /srv here (but that probably needs some patch to htcondor?)
   b) Providing that bind mount, I am still missing (inside the container):
      /usr/libexec/condor/condor_ssh_to_job_shell_setup
      which is enforced via authorized_keys.
      So a first "hack" would be to provide an additional bind mount of /usr/libexec/condor/ inside the container. 
      Due to (2), I can sadly only provide one bind mount at a time :-(. 
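
The path problem in (3) can be sketched with a small Python model of the bind mounts. This is only an illustration, not how HTCondor or Singularity work internally: the helper names parse_binds and container_path are made up, and the model assumes that with "-C" only the "-B" binds make host paths visible (it ignores whatever the image itself contains).

```python
import shlex
from typing import Dict, Optional


def parse_binds(cmdline: str) -> Dict[str, str]:
    """Collect host -> container mappings from the -B options of a command line."""
    tokens = shlex.split(cmdline)
    binds: Dict[str, str] = {}
    for i, tok in enumerate(tokens):
        if tok == "-B" and i + 1 < len(tokens):
            parts = tokens[i + 1].split(":")
            src = parts[0].rstrip("/")
            # Without an explicit destination, the path is bound to itself.
            dest = parts[1].rstrip("/") if len(parts) > 1 and parts[1] else src
            binds[src] = dest
    return binds


def container_path(host_path: str, binds: Dict[str, str]) -> Optional[str]:
    """Return where host_path shows up in the container, or None if no bind covers it."""
    for src, dest in binds.items():
        if host_path == src or host_path.startswith(src + "/"):
            return dest + host_path[len(src):]
    return None


# Command line and paths exactly as quoted in this mail (<PID> left as a placeholder).
CMD = ("singularity exec -B /pool/condor/dir_<PID>:/srv --pwd /srv "
       "-S /var/run -S /var/tmp -S /tmp -C /scratch/foo.img /bin/sleep 180")
SSHD_CONFIG = "/pool/condor/dir_<PID>/.condor_ssh_to_job_1/sshd_config"
SHELL_SETUP = "/usr/libexec/condor/condor_ssh_to_job_shell_setup"

binds = parse_binds(CMD)
print(container_path(SSHD_CONFIG, binds))  # /srv/.condor_ssh_to_job_1/sshd_config
print(container_path(SHELL_SETUP, binds))  # None
```

Under these assumptions, sshd_config is reachable only below /srv, not under the /pool path that sshd is given, and the shell setup script is not covered by any bind at all.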

Am I doing something wrong / missing something, or are these all open issues with the initial singularity support?
Using singularity for interactive jobs is a real "killer feature" since users can e.g. compile their code in the environment they want without having a physical machine,
so if there's any way to make it work, I'll gladly follow it. 

Cheers, many thanks in advance and all the best, 
	Oliver

-- 
Oliver Freyermuth
Physikalisches Institut der Universität Bonn
Nußallee 12
53115 Bonn
--