
Re: [HTCondor-users] job Singularity submission/forced image starts



Hi Oliver,

many thanks for the details! I fooled myself then ;)

Currently the namespaces of a job's processes are still the same as the
default kernel namespaces, but now I should have some leverage for
debugging ;)
Actually, it looks like a clash when bind-mounting the job's home dir
to a path in the container's namespace [1] (when naively trying to start
a container in an interactive session). This might be because, for
historical reasons, we bind-mount homes to another path [2]... :-/
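
The namespace check Oliver suggests in the quoted message below (comparing
/proc/$PID/ns/* of a job process against the host) could be scripted roughly
like this. It is only a sketch: the function names are made up for
illustration, it assumes a Linux /proc, and reading another user's or PID 1's
namespace links usually needs root, so the reference defaults to "self".

```python
import os


def namespace_ids(pid):
    """Map namespace name -> identifier (e.g. 'mnt:[4026531840]') for a process.

    `pid` may be a number or the string "self".
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace.
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Not readable (permissions) or the process just exited.
            ids[name] = None
    return ids


def differing_namespaces(pid, reference_pid="self"):
    """Return the namespace names in which `pid` differs from `reference_pid`."""
    job = namespace_ids(pid)
    ref = namespace_ids(reference_pid)
    return sorted(
        name
        for name in job
        if job[name] is not None
        and ref.get(name) is not None
        and job[name] != ref[name]
    )


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the PID of a process belonging to the job.
    target = sys.argv[1] if len(sys.argv) > 1 else "self"
    print(differing_namespaces(target))
```

An empty list means the process shares every readable namespace with the
reference, i.e. it is most likely not running inside a container. A
containerised job would be expected to list at least "mnt".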


Cheers and thanks,
  Thomas

[1]
> singularity shell /cvmfs/singularity.opensciencegrid.org/atlas/analysisbase\:21.2.4/
...
VERBOSE: Creating bind point within container: /dev/urandom
VERBOSE: Mounting home directory source into session directory: /var/home/opsusr019 -> /var/singularity/mnt/session/var/home/opsusr019
VERBOSE: Failed to create parent directory /var/singularity/mnt/final/var/home/opsusr019
ERROR  : Failed creating home directory in container /var/singularity/mnt/final/var/home/opsusr019: Operation not supported
ABORT  : Retval = 255

[2]
findmnt | grep  "\["
├─/tmp     /dev/sda6[/tmp]   ext4   rw,relatime,data=ordered
├─/home    /dev/sda6[/home]  ext4   rw,relatime,data=ordered
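
What the findmnt filter above picks out (mounts whose source is a
subdirectory of a filesystem, shown as device[/subdir]) can also be read from
/proc/self/mountinfo, where the fourth field is the root of the mount within
its filesystem. A rough sketch, with a hypothetical function name; the sample
lines are modelled on the output above, and their mount IDs and the /dev/sda2
root device are invented:

```python
def subdir_bind_mounts(mountinfo_text):
    """Return (mount point, root within filesystem, source) for every mount
    whose root is not '/', which is how bind mounts of a subdirectory appear.
    """
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-"; fstype and source follow it.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if len(fields) < sep + 3:
            continue
        root, mount_point = fields[3], fields[4]
        source = fields[sep + 2]
        if root != "/":
            found.append((mount_point, root, source))
    return found


# Illustrative input only; on a real node use open("/proc/self/mountinfo").read()
SAMPLE = """\
25 0 8:2 / / rw,relatime shared:1 - ext4 /dev/sda2 rw,data=ordered
36 25 8:6 /tmp /tmp rw,relatime shared:2 - ext4 /dev/sda6 rw,data=ordered
37 25 8:6 /home /home rw,relatime shared:3 - ext4 /dev/sda6 rw,data=ordered
"""

if __name__ == "__main__":
    for mount_point, root, source in subdir_bind_mounts(SAMPLE):
        print(f"{mount_point}  {source}[{root}]")
```

Note that a non-"/" root is not exclusive to bind mounts (btrfs subvolumes
show up the same way), so this is a hint rather than proof.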


On 2018-02-21 12:36, Oliver Freyermuth wrote:
> Hi Thomas,
> 
> Am 20.02.2018 um 17:48 schrieb Thomas Hartmann:
>> Hi all,
>>
>> I am somewhat struggling to get jobs to start natively in Singularity.
>>
>> Neither by hard-wiring jobs to start transparently in a given Singularity
>> image nor with user-side image selection enabled have I managed to
>> get an (interactive) test job to start in the container.
>> On the node, the job starts and runs without complaints, but ssh and its
>> children are always directly attached to condor_starter without being
>> wrapped by Singularity (judging from ps). So far I have not found
>> anything in the logs that gives a hint.
> 
> You cannot judge from ps.
> 
> The problem is that Singularity will not show up in the process tree.
> For example, you can try:
> 
> $ singularity shell some-container
> Singularity: Invoking an interactive shell within container...
> Singularity ubuntu-ssh:~/Ubuntu> sleep 100
> 
> on your local machine, and the process tree would look like:
> \_ /bin/zsh     <= my shell on the host where I start singularity
>     \_ /bin/bash --norc        <= shell inside the container
>         \_ sleep 100           <= program inside the shell
> 
> Singularity itself will not show up anymore after setting up the namespaces, bind mounts etc.
> The same is true for other "lightweight" container solutions, such as Charliecloud.
> 
> However, if Singularity's implementation of the PID namespace (which HTCondor uses) were correct / complete,
> it would still show up with a shim-init process (cf. https://github.com/singularityware/singularity/pull/1221#issuecomment-367036129 ).
> 
> This all may change in future versions of Singularity (I'm pretty sure of it), since they plan to rework the full infrastructure
> (see e.g. https://www.sylabs.io/2018/02/singularity-golang/ ). It seems they plan to run a separate privileged RPC server process
> to handle privileged activities, which feels very similar to Docker. 
> 
> But right now, the only way to check that things worked out fine is to look at the actual environment you see inside the container, 
> or check the actual namespaces it is living in (e.g. by checking /proc/$PID/ns/* of one of the processes "inside" the container and comparing it against the host namespaces). 
> 
> Cheers,
> 	Oliver
> 
>>
>> Maybe somebody has an idea, what I might have missed?
>>
>> Cheers and Thanks,
>>   Thomas
>>
>> ps: condor is on 8.6.8
>> condor-classads-8.6.8-1.el7.x86_64
>> condor-procd-8.6.8-1.el7.x86_64
>> condor-python-8.6.8-1.el7.x86_64
>> condor-8.6.8-1.el7.x86_64
>> condor-external-libs-8.6.8-1.el7.x86_64
>>