
[HTCondor-users] Issue when removing job from queue after completion



Hello all,

Since updating to HTCondor 8.4.9, I'm having a problem with jobs being removed from the queue after they complete.

The job never switches to Completed status, even though the actual work of the job finishes correctly.

The condor_shadow log below contains errors that suggest a permission or user problem (SetEffectiveOwner failing with errno 13, and spool directories that cannot be opened as PRIV_USER), but I couldn't work out what is causing them.

Any insight is appreciated.

Thanks and best regards,
Xavier

11/16/16 15:08:00 ******************************************************
11/16/16 15:08:00 ** condor_shadow (CONDOR_SHADOW) STARTING UP
11/16/16 15:08:00 ** [EDITED]/condor/sbin/condor_shadow
11/16/16 15:08:00 ** SubsystemInfo: name=SHADOW type=SHADOW(6) class=DAEMON(1)
11/16/16 15:08:00 ** Configuration: subsystem:SHADOW local:<NONE> class:DAEMON
11/16/16 15:08:00 ** $CondorVersion: 8.4.9 Sep 29 2016 BuildID: 382747 $
11/16/16 15:08:00 ** $CondorPlatform: x86_64_RedHat7 $
11/16/16 15:08:00 ** PID = 18636
11/16/16 15:08:00 ** Log last touched 11/16 15:07:58
11/16/16 15:08:00 ******************************************************
11/16/16 15:08:00 Using config source: [EDITED]/condor/etc/condor_config
11/16/16 15:08:00 Using local config sources:
11/16/16 15:08:00   [EDITED]/condor/etc/condor_config
11/16/16 15:08:00   [EDITED]/condor/configs/sodl06001
11/16/16 15:08:00 config Macros = 169, Sorted = 169, StringBytes = 4339, TablesBytes = 2760
11/16/16 15:08:00 CLASSAD_CACHING is OFF
11/16/16 15:08:00 Daemon Log is logging: D_ALWAYS D_ERROR
11/16/16 15:08:00 Daemoncore: Listening at <0.0.0.0:38591> on TCP (ReliSock).
11/16/16 15:08:00 DaemonCore: command socket at <192.168.11.46:38591?addrs=192.168.11.46-38591&noUDP>
11/16/16 15:08:00 DaemonCore: private command socket at <192.168.11.46:38591?addrs=192.168.11.46-38591>
11/16/16 15:08:00 Initializing a JAVA shadow for job 6.0
11/16/16 15:08:00 (6.0) (18636): Request to run on slot1@xxxxxxxxxxxxxxxxxxxxxx <192.168.11.46:42195?addrs=192.168.11.46-42195> was ACCEPTED
11/16/16 15:08:00 (6.0) (18636): Can't open directory "[EDITED]/spool/6/0/cluster6.proc0.subproc0.tmp" as PRIV_USER, errno: 2 (No such file or directory)
11/16/16 15:08:00 (6.0) (18636): Can't open directory "[EDITED]/spool/6/0/cluster6.proc0.subproc0" as PRIV_USER, errno: 2 (No such file or directory)
11/16/16 15:08:00 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:00 (6.0) (18636): File transfer completed successfully.
11/16/16 15:08:00 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:03 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:03 (6.0) (18636): File transfer completed successfully.
11/16/16 15:08:03 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:03 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:03 (6.0) (18636): Failed to perform final update to job queue!
11/16/16 15:08:33 (6.0) (18636): Retrying job cleanup, calling terminateJob()
11/16/16 15:08:33 (6.0) (18636): SetEffectiveOwner(xjanin) failed with errno=13: Permission denied.
11/16/16 15:08:33 (6.0) (18636): Failed to perform final update to job queue!
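In case it helps anyone compare against a longer ShadowLog, here is a minimal Python sketch that counts the errno values reported per job. It assumes the line layout seen in the excerpt above (date, time, job id in parentheses, PID in parentheses, then the message); the function name summarize_failures and both regular expressions are my own illustration, not part of HTCondor.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   11/16/16 15:08:00 (6.0) (18636): <message>
LINE_RE = re.compile(r"^(\S+ \S+) \((\d+\.\d+)\) \((\d+)\): (.*)$")
# Matches both "errno=13" and "errno: 2" as they appear in the excerpt.
ERRNO_RE = re.compile(r"errno[:=]\s*(\d+)")


def summarize_failures(lines):
    """Count (job_id, errno) pairs found in shadow log lines.

    Hypothetical helper: lines that do not match the assumed layout,
    or that carry no errno, are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        job_id, message = match.group(2), match.group(4)
        errno_match = ERRNO_RE.search(message)
        if errno_match is not None:
            counts[(job_id, int(errno_match.group(1)))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_shadow_log.py /path/to/ShadowLog
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (job_id, errno_value), n in sorted(summarize_failures(handle).items()):
            print(f"job {job_id}: errno {errno_value} seen {n} time(s)")
```

Here errno 2 is ENOENT (no such file or directory) and errno 13 is EACCES (permission denied), matching the text the shadow prints next to each value.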






Airbus DS Geo SA (325 089 589 RCS Toulouse) - Siege social: 5, rue des Satellites, 31400 Toulouse, France.