
Re: [HTCondor-users] Usage of singularity for interactive jobs



Hi, 

The ptty creation issue has been fixed upstream in singularity; the fix is currently in PR https://github.com/singularityware/singularity/pull/871 . 

However, "condor_ssh_to_job" and also "condor_submit -interactive" do not work correctly. 
The problem is that condor runs sshd in a NEW container session, which is then shielded from the container in which the job itself is running. 
I verified this by checking how condor invokes "singularity": it creates one container instance for the initial sleep (for an interactive job) or for the job itself (for a normal job),
and a separate, fresh session for the sshd. 
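
For anyone who wants to repeat that check on a Linux execute node, here is a rough sketch (not necessarily how I did it) that lists running processes whose command line mentions singularity. The helper names are made up for illustration, and the substring match is deliberately crude, so it may also catch unrelated processes.

```python
import os
from typing import Dict, List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split a NUL-separated /proc/<pid>/cmdline blob into arguments."""
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def mentions_singularity(argv: List[str]) -> bool:
    """Crude filter (assumption): any argument containing 'singularity'."""
    return any("singularity" in arg for arg in argv)


def singularity_invocations(proc_root: str = "/proc") -> Dict[int, List[str]]:
    """Map PID -> argv for every process that mentions singularity."""
    found: Dict[int, List[str]] = {}
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc, nothing to inspect
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                argv = parse_cmdline(fh.read())
        except OSError:
            continue  # process exited or is not readable
        if mentions_singularity(argv):
            found[int(entry)] = argv
    return found


if __name__ == "__main__":
    for pid, argv in sorted(singularity_invocations().items()):
        print(pid, " ".join(argv))
```

Running it once while the job is up and again after condor_ssh_to_job has connected should show whether a second, independent singularity invocation appears for the sshd.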

Are there plans to fix this? And how are other sites using HTCondor with singularity? 

Cheers and all the best, 
	Oliver

On 04.08.2017 at 15:18, Oliver Freyermuth wrote:
> Hi, 
> 
> just to let you know: I have now also asked the question concerning the "ptty-creation" in a "singularity -C" container to the singularity crowd:
> https://github.com/singularityware/singularity/issues/857
> Maybe this issue can be solved better inside singularity. For the other points (which are mainly inconveniences), I guess HTCondor needs changes. 
> 
> Cheers and have a nice weekend, 
> 	Oliver
> 
> On 04.08.2017 at 01:04, Oliver Freyermuth wrote:
>> Many thanks for your quick reply! 
>>
>> On 04.08.2017 at 00:50, Todd L Miller wrote:
>>>> 1) "bind path" specifications from singularity.conf are effectively ignored, since htcondor passes "-C" to singularity.
>>>
>>>     I believe this is on purpose, since you may not want random strangers from the internet to have the same set of bind mounts as your local users.
>> Ok, understood. I was just unsure whether this is a good default, but you convinced me that it is. 
>>>
>>>> 2) SINGULARITY_BIND_EXPR does not really work as expected for me.
>>>
>>>     I'm pretty sure that the _EXPR means that the configuration value must be a valid ClassAd expression.  For you, that probably means "a single ClassAd string".  Try
>>>
>>> SINGULARITY_BIND_EXPR = "/usr/libexec/condor/,/pool"
>>>
>>> instead.
>> You are perfectly right! With this trick in place, the interactive job starts up well! 
>> I am only missing the shell's "prompt" part (i.e. PS1 is somehow empty), but the shell in fact works, so maybe this is some environment issue I still have to look into. 
>>
>> Of course, a nicer solution would be to have this without the extra bind mounts (which expose parts of the file system the job does not really need). 
>> My suggestion would be to re-use the "/srv/" mountpoint which resolves to /pool/condor/dir_<PID> on the host,
>> and maybe copy /usr/libexec/condor/condor_ssh_to_job_shell_setup there for singularity-based jobs. 
>>
>> I guess that would need changes to the HTCondor code, though:
>> 1) Change the "%f" in SSH_TO_JOB_SSHD_ARGS for singularity jobs to use /srv as path.
>> 2) Add the extra step of copying the shell-setup script. 
>> 3) Use a modified path in the authorized_keys to look for the condor_ssh_to_job_shell_setup script. 
>>
>> Of course, an alternative solution for (2) and (3) could be to document that the container should provide some entry point like that. I'm not sure whether that's a good idea (but right now, sshd is also required...). 
>>
>> Many thanks and cheers, 
>> 	Oliver
>>
>>>
>>> - ToddM
>>
>>


-- 
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--