[HTCondor-users] Can't run singularity container via HTCondor job
- Date: Thu, 27 Sep 2018 15:23:36 +0300
- From: Evgeniy Kuznetsov <evkuz@xxxxxxx>
- Subject: [HTCondor-users] Can't run singularity container via HTCondor job
Hi all,
I can't run an HTCondor job under Singularity on the execute host.
The job is submitted, but the error log on the submit host says:
ERROR : Home directory is not owned by calling user: /
ABORT : Retval = 255
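The error text suggests that Singularity saw "/" as the home directory of the calling user and that this directory belongs to a different account. A minimal sketch of a check that might narrow this down, assuming GNU stat and getent are available on the execute host; the function names are hypothetical, and which account HTCondor actually runs the job as depends on the pool configuration and is not shown here:

```shell
#!/bin/bash
# Hypothetical helpers, not part of HTCondor or Singularity.

# Print the home directory recorded in the passwd database for account $1.
# Prints nothing if the account is unknown.
passwd_home() {
    getent passwd "$1" | cut -d: -f6
}

# Return 0 if directory $1 is owned by numeric uid $2,
# 1 if it is owned by someone else, 2 if $1 is not a directory.
dir_owned_by_uid() {
    local dir="$1" uid="$2"
    [ -d "$dir" ] || return 2
    # GNU stat: %u prints the numeric uid of the owner
    [ "$(stat -c %u "$dir")" = "$uid" ]
}

# Example, run as the account in question:
#   home="$(passwd_home "$(id -un)")"
#   dir_owned_by_uid "$home" "$(id -u)" && echo "owned" || echo "not owned: $home"
```

If the check reports "not owned" with a home of "/", that would be consistent with the message above, though it does not by itself show how the job account is chosen.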
My configuration data:
execute node: singularity --version 2.6.0-HEAD.579c415, CentOS 7,
Condor V8.6.12
submit node: singularity --version 2.6.0-HEAD.579c415, CentOS 6,
Condor V8.6.12
On the execute host, singularity commands work when I run them
manually as a non-root user:
$ singularity run /tmp/hello-world.simg
RaawwWWWWWRRRR!!
$ singularity exec /tmp/hello-world.simg cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.5 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
So Singularity runs fine on the execute node.
But I can't run the simple 'hello-world' Singularity container via an
HTCondor job.
Here are my startd configuration parameters for Singularity (the "User
Request" variant shown by Brian Bockelman):
SINGULARITY = /usr/local/bin/singularity
SINGULARITY_JOB = !isUndefined(TARGET.SingularityImage)
SINGULARITY_IMAGE_EXPR = TARGET.SingularityImage
And the submit file:
------------------------
Universe = vanilla
executable = singularity_hello.sh
requirements = (Machine == "execute node")
+SingularityImage = "/tmp/hello-world.simg"
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
output = out
error = err
log = log
queue
------------------------------
The executable script:
-----------------------------
#!/bin/bash
date
echo "I'm process id $$ on" `hostname`
echo "This is sent to standard error" 1>&2
echo "Running as binary $0" "$@"
cat /etc/os-release
-----------------------------------
With this configuration I hoped to get the output of the "singularity
run /tmp/hello-world.simg" command, as if it had been typed in the
execute host's shell rather than inside the Singularity container, or at
least the output of the "cat /etc/os-release" command running inside the
container.
But the executable apparently did not even start: the 'out' file named
in the submit file is empty, with no output from the 'echo' commands.
Here is also an extract from StarterLog.slot1 on the execute host:
...
09/27/18 13:53:53 (pid:19713) Job 109.0 set to execute immediately
09/27/18 13:53:53 (pid:19713) Starting a VANILLA universe job with ID: 109.0
09/27/18 13:53:53 (pid:19713) IWD: /var/lib/condor/execute/dir_19713
09/27/18 13:53:53 (pid:19713) Output file: /var/lib/condor/execute/dir_19713/_condor_stdout
09/27/18 13:53:53 (pid:19713) Error file: /var/lib/condor/execute/dir_19713/_condor_stderr
09/27/18 13:53:53 (pid:19713) Renice expr "0" evaluated to 0
09/27/18 13:53:53 (pid:19713) About to exec /var/lib/condor/execute/dir_19713/condor_exec.exe
09/27/18 13:53:53 (pid:19713) Running job via singularity.
09/27/18 13:53:54 (pid:19713) Create_Process succeeded, pid=19728
09/27/18 13:53:54 (pid:19713) Process exited, pid=19728, status=255
09/27/18 13:53:54 (pid:19713) Got SIGQUIT. Performing fast shutdown.
09/27/18 13:53:54 (pid:19713) ShutdownFast all jobs.
09/27/18 13:53:54 (pid:19713) **** condor_starter (condor_STARTER) pid 19713 EXITING WITH STATUS 0
-------------
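The status=255 in the "Process exited" line appears to match the "Retval = 255" reported in the job's error log. A small sketch for pulling such lines out of a starter log, assuming the log format shown above; the function name is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read starter log lines on stdin and print
# "<pid> <status>" for every "Process exited" line.
starter_exit_statuses() {
    sed -n 's/.*Process exited, pid=\([0-9]*\), status=\([0-9]*\).*/\1 \2/p'
}

# Example (the log path depends on the local HTCondor configuration):
#   starter_exit_statuses < StarterLog.slot1
```

Lines that do not contain a "Process exited" record are skipped, so the whole log can be passed in unfiltered.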
To summarize: Singularity is installed and runs fine on the execute node.
HTCondor tries to run the job under Singularity, but there is some
permission/ownership issue that I don't understand.
Please point me in the right direction.
Evgeny.