
Re: [HTCondor-users] .gahp



Hi Condor Helpers,

At least two ARC/HTCondor sites in the UK are getting file transfer errors like this:

# cd /var/log/condor/
# grep .gahp ShadowLog*

ReliSock::put_file_with_permissions(): Failed to stat file '/var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete':No such file or directory (errno: 2, si_error: 1)

DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.178.105 failed to send file(s) to <192.168.26.14:27452>: error reading from /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete:(errno 2) No such file or directory; STARTER failed to receive file(s) from <138.253.178.105:9618>

Job 208640.0 going into Hold state (code 13,2): Error from slot1_15@xxxxxxxxxxxxxxxxxxxx: SHADOW at 192.168.178.105 failed to send file(s) to <192.168.26.14:27452>: error reading from /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUaDm9BFKDmwLXxtm/.gahp_complete: (errno 2) No such file or directory; STARTER failed to receive file(s) from <138.253.178.105:9618>

We're using CentOS 7, nordugrid-arc-5.4.1-1.el7.centos.x86_64 and condor-8.6.3-1.el7.x86_64.

The errors are very sporadic; most jobs are fine. What is ".gahp_complete"? Some sort of signal file?
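
To put a number on "sporadic", a rough Python sketch like the one below could list the distinct jobs that were held for this reason. It assumes each hold entry sits on a single ShadowLog line of the form "Job <cluster>.<proc> going into Hold state (code N,M): ..." and mentions ".gahp_complete", as in the excerpt above. The function name and the regular expression are my own and not part of HTCondor.

```python
import re

# Assumed line format, taken from the ShadowLog excerpt in this message:
#   "Job 208640.0 going into Hold state (code 13,2): ... .gahp_complete ..."
HOLD_RE = re.compile(r"Job (\d+\.\d+) going into Hold state \(code \d+,\d+\)")


def held_gahp_jobs(lines):
    """Return the unique job IDs whose hold message mentions .gahp_complete.

    `lines` is any iterable of log lines (an open file works).
    IDs are returned as strings, sorted as strings.
    """
    jobs = set()
    for line in lines:
        # Skip lines unrelated to the missing .gahp_complete file.
        if ".gahp_complete" not in line:
            continue
        match = HOLD_RE.search(line)
        if match:
            jobs.add(match.group(1))
    return sorted(jobs)
```

Passing an open ShadowLog file to `held_gahp_jobs` returns the affected job IDs, and the length of that list can be compared with the total number of jobs run over the same period.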

Anyway, I wondered if anyone else has seen this and knows what to do.


Cheers,

Ste