
Re: [HTCondor-users] job Singularity submission/forced image starts



Hi Thomas,

On 20.02.2018 at 17:48, Thomas Hartmann wrote:
> Hi all,
> 
> I am somewhat struggling to get jobs to start natively in Singularity.
> 
> I have not managed to get an (interactive) test job to start in the
> container, neither by hard-wiring jobs to start transparently in a given
> Singularity image nor with user-side image selection enabled.
> On the node, the job starts and runs without complaints, but ssh and its
> children are always directly attached to condor_starter without being
> wrapped by Singularity (judging from ps). So far I have not found
> anything in the logs that gives a hint.

You cannot judge this from ps.

The problem is that Singularity will not show up in the process tree.
For example, you can try:

$ singularity shell some-container
Singularity: Invoking an interactive shell within container...
Singularity ubuntu-ssh:~/Ubuntu> sleep 100

on your local machine, and the process tree would look like:
\_ /bin/zsh     <= my shell on the host where I start singularity
    \_ /bin/bash --norc        <= shell inside the container
        \_ sleep 100           <= program inside the shell

Singularity itself no longer shows up once it has set up the namespaces, bind mounts etc.
The same is true for other "lightweight" container solutions, such as Charliecloud.

However, if Singularity's implementation of the PID namespace (which HTCondor uses) were correct and complete,
it would still show up with a shim-init process (cf. https://github.com/singularityware/singularity/pull/1221#issuecomment-367036129 ).

This all may change in future versions of Singularity (I'm pretty sure of it), since they plan to rework the full infrastructure
(see e.g. https://www.sylabs.io/2018/02/singularity-golang/ ). It seems they plan to run a separate privileged RPC server process
to handle privileged activities, which feels very similar to Docker. 

But right now, the only way to check that things worked out fine is to look at the actual environment you see inside the container, 
or check the actual namespaces it is living in (e.g. by checking /proc/$PID/ns/* of one of the processes "inside" the container and comparing it against the host namespaces). 
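
The namespace comparison can be sketched as a small shell function. This is only a minimal sketch: the name compare_ns is made up for illustration, it assumes a Linux /proc and a readlink binary, and reading /proc/$PID/ns/* of another user's process usually requires root. It prints, for each namespace of the given PID, whether it is the same as that of a reference process (by default the shell that calls the function).

```shell
#!/bin/sh
# compare_ns (hypothetical helper): compare the namespaces of a process
# against those of a reference process.
# Usage: compare_ns PID [REFERENCE_PID]
# REFERENCE_PID defaults to the calling shell ($$). Comparing against
# PID 1 usually needs root, because /proc/1/ns/* is not readable otherwise.
compare_ns() {
    pid=$1
    ref=${2:-$$}
    for link in /proc/"$pid"/ns/*; do
        name=${link##*/}    # e.g. mnt, pid, net, user
        # Each entry is a symlink such as "mnt:[4026531840]"; two processes
        # share a namespace exactly when the link targets are identical.
        target=$(readlink "$link") || continue
        ref_target=$(readlink "/proc/$ref/ns/$name") || continue
        if [ "$target" = "$ref_target" ]; then
            echo "$name: same ($target)"
        else
            echo "$name: differs ($target vs. $ref_target)"
        fi
    done
}
```

On the worker node you would pick the PID of one of the job's processes (e.g. the shell below condor_starter) and run "compare_ns $PID" from a host shell. If the job really runs inside the container, at least some entries (typically mnt, and pid if a PID namespace is used) should be reported as "differs".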

Cheers,
	Oliver

> 
> Maybe somebody has an idea, what I might have missed?
> 
> Cheers and Thanks,
>   Thomas
> 
> ps: condor is on 8.6.8
> condor-classads-8.6.8-1.el7.x86_64
> condor-procd-8.6.8-1.el7.x86_64
> condor-python-8.6.8-1.el7.x86_64
> condor-8.6.8-1.el7.x86_64
> condor-external-libs-8.6.8-1.el7.x86_64