[HTCondor-users] Jobs fail after updating from 10.5.0 to 10.6.0/10.7.0



Hello,

We have WNs running AlmaLinux 9 with HTCondor 10.5.0 that were apparently working fine. However, after updating to 10.6.0 (or 10.7.0), new jobs are not executed correctly. These errors appear in StarterLog.slotX_X:

09/01/23 07:23:30 (pid:54345) Create_Process succeeded, pid=54393
09/01/23 07:23:30 (pid:54345) Process exited, pid=54393, status=127
09/01/23 07:23:30 (pid:54345) JobReaper: condor_pid_ns_init didn't drop filename /home/execute/dir_54345/.condor_pid_ns_status (2)
09/01/23 07:23:30 (pid:54345) ERROR "Starter configured to use PID NAMESPACES, but libexec/condor_pid_ns_init did not run properly" at line 751 in file /var/lib/condor/execute/slot1/dir_3398586/userdir/build-ytPdzf/BUILD/condor-10.7.0/src/condor_starter.V6.1/vanilla_proc.cpp

09/01/23 07:23:30 (pid:54345) ShutdownFast all jobs.
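
In case it helps anyone checking their own nodes, here is a minimal sketch of how lines like the ones above could be pulled out of a StarterLog. It assumes the "MM/DD/YY HH:MM:SS (pid:N) message" layout seen in this excerpt; the function name summarize and the regular expressions are only illustrative and are not part of HTCondor.

```python
import re

# Assumed line layout, taken from the excerpt above:
#   MM/DD/YY HH:MM:SS (pid:<starter pid>) <message>
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:(\d+)\) (.*)$")
EXIT_RE = re.compile(r"Process exited, pid=(\d+), status=(\d+)")


def summarize(lines):
    """Collect non-zero process exits and ERROR messages per starter pid.

    Hypothetical helper for illustration only.
    """
    exits, errors = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or lines in another format
        _, starter_pid, message = match.groups()
        exited = EXIT_RE.search(message)
        if exited and int(exited.group(2)) != 0:
            # (starter pid, child pid, exit status)
            exits.append((int(starter_pid), int(exited.group(1)), int(exited.group(2))))
        elif message.startswith("ERROR"):
            errors.append((int(starter_pid), message))
    return exits, errors


# The sample is the StarterLog excerpt quoted in this message.
sample = [
    "09/01/23 07:23:30 (pid:54345) Create_Process succeeded, pid=54393",
    "09/01/23 07:23:30 (pid:54345) Process exited, pid=54393, status=127",
    "09/01/23 07:23:30 (pid:54345) JobReaper: condor_pid_ns_init didn't drop filename /home/execute/dir_54345/.condor_pid_ns_status (2)",
    '09/01/23 07:23:30 (pid:54345) ERROR "Starter configured to use PID NAMESPACES, but libexec/condor_pid_ns_init did not run properly" at line 751 in file /var/lib/condor/execute/slot1/dir_3398586/userdir/build-ytPdzf/BUILD/condor-10.7.0/src/condor_starter.V6.1/vanilla_proc.cpp',
    "09/01/23 07:23:30 (pid:54345) ShutdownFast all jobs.",
]

exits, errors = summarize(sample)
print(exits)
print(len(errors))
```

To run it against a real file, pass the open file object (or its lines) to summarize instead of the sample list.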

I do not see any other hint in the StartLog:

09/01/23 07:23:30 Starter pid 54345 exited with status 4
09/01/23 07:23:30 slot1_1: State change: starter exited
09/01/23 07:23:30 slot1_1: Changing activity: Busy -> Idle

Having re-read the version history, I am not sure which change causes this error. Has anyone had a similar problem?

Thank you in advance.

Best regards,

Carles

--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
Avís - Aviso - Legal Notice: http://legal.ifae.es