Re: [HTCondor-users] Problems with singularity version 3.8.1




First, apologies.  I apparently wasn't actually subscribed to this list until yesterday, so I can't really "reply" to the thread I would like to.  But the subject line is the thread I'm trying to reply to.

Greg, we have either more data for Matthias's bug report, or a similar but subtly different problem.  Please let us know if you want more information.

Starting about a week ago we had a user seeing her jobs held with messages such as this:

007 (42052443.000.000) 2021-09-01 10:30:09 Shadow exception!
Error from slot1_4@xxxxxxxxxxxxxxxxxxxxxxxx: Singularity test failed:INFO:    Could not find any nv files on this host!
0  -  Run Bytes Sent By Job
2084  -  Run Bytes Received By Job

If I directly test the same singularity image from the command line, I see:

[joshua.willis@ldas-osg ~]$ singularity test --nv /home/rebecca.ewing/observing/4/dev/builds/gstlal_dev-082721 ; echo $?
INFO:    Could not find any nv files on this host!
INFO:    No test script found in container, exiting
No test found in container, executing /bin/sh -c true
0

That is, there is an additional warning line in the output, but the exit code of the test is still zero.
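
For anyone who wants to check the same thing on their own host, here is a minimal sketch that reports the exit status separately from the amount of output, so that an extra INFO line is not mistaken for a failure. The helper name check_cmd and the IMAGE placeholder are assumptions made for illustration; neither is part of Singularity or HTCondor, and this says nothing about how HTCondor itself evaluates the test.

```shell
#!/bin/bash
# check_cmd is a hypothetical helper, not part of Singularity or HTCondor.
# It runs the given command, captures stdout and stderr together, and
# reports the exit status separately from the number of output lines.
check_cmd() {
    local output status
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    # grep -c . counts the non-empty lines of the captured output
    printf 'exit=%d lines=%d\n' "$status" "$(printf '%s\n' "$output" | grep -c .)"
    return "$status"
}

# Example (IMAGE is a placeholder for the path to a container image):
#   check_cmd singularity test --nv "$IMAGE"
```

A run that behaves like the one above should report exit=0 together with a nonzero line count, which is the combination that seems to matter here.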

If I omit the "--nv" I don't get the message about not finding nv files (unsurprisingly).

We think that last point might be relevant because James can, with his standard test jobs, reproduce the error at CIT when submitting from either HTCondor 9.0.4 or 9.0.5, and singularity 3.8.1.  However those same jobs succeed when they come into CIT from OSG, even though the version of singularity is the same.  So we suspect that maybe "singularity test" is not always invoked with "--nv", but perhaps it's something else.

If you can confirm that this is the same problem Matthias saw, then we will happily await the patch for testing.  Otherwise we wanted to alert you that there may be a different but similar problem.

Cheers,

Josh

--
Josh Willis
jlwillis@xxxxxxxxxxx

Computational Scientist
Caltech/LIGO