Re: [HTCondor-users] .gahp
- Date: Mon, 26 Mar 2018 17:57:01 +0100
- From: Stephen Jones <sjones@xxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] .gahp
Hi Condor Helpers,
At least two ARC/HTCondor sites in the UK are getting file transfer
errors like this:
# cd /var/log/condor/
# grep .gahp ShadowLog*
ReliSock::put_file_with_permissions(): Failed to stat file '/var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete': No such file or directory (errno: 2, si_error: 1)
DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.178.105 failed to send file(s) to <192.168.26.14:27452>: error reading from /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete: (errno 2) No such file or directory; STARTER failed to receive file(s) from <138.253.178.105:9618>
Job 208640.0 going into Hold state (code 13,2): Error from slot1_15@xxxxxxxxxxxxxxxxxxxx: SHADOW at 192.168.178.105 failed to send file(s) to <192.168.26.14:27452>: error reading from /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete: (errno 2) No such file or directory; STARTER failed to receive file(s) from <138.253.178.105:9618>

We're using CentOS7, nordugrid-arc-5.4.1-1.el7.centos.x86_64 and
condor-8.6.3-1.el7.x86_64.
The errors are very sporadic; most jobs are fine. What's ".gahp_complete"? Is it
a sort of signal file?
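
In case it helps anyone compare notes on how sporadic this is, here is a minimal Python sketch for counting which jobs went on hold with this error. It is only a sketch under assumptions: the function name held_jobs and the regular expression are my own, the pattern is taken from the hold message quoted above, and it assumes each hold message sits on a single line in the real ShadowLog files.

```python
import glob
import re
from collections import Counter

# Pattern modelled on the hold message quoted in this post (an assumption
# about the log format, not something taken from HTCondor documentation).
HOLD_RE = re.compile(r"Job (\d+\.\d+) going into Hold state \(code (\d+),(\d+)\)")


def held_jobs(lines, marker=".gahp_complete"):
    """Count hold messages per job ID among lines that mention `marker`."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue  # unrelated log line
        match = HOLD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    lines = []
    # Same files as the grep above; the path is the one used on our hosts.
    for path in glob.glob("/var/log/condor/ShadowLog*"):
        with open(path, errors="replace") as handle:
            lines.extend(handle)
    for job_id, n in sorted(held_jobs(lines).items()):
        print(job_id, n)
```

Comparing the number of job IDs printed with the total number of jobs run over the same period would give a rough failure rate.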
Anyway, I wondered if anyone else has seen this and knows what to do.
Cheers,
Ste