Re: [HTCondor-users] STARTER at 192.168.22.18 failed to send file(s) to <192.168.178.102:27682>; SHADOW ...
- Date: Mon, 01 Aug 2016 07:40:12 -0500
- From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] STARTER at 192.168.22.18 failed to send file(s) to <192.168.178.102:27682>; SHADOW ...
Is it possible that the interaction between ARC and HTCondor is changing the permissions on the directory itself? I would look at the spool directory permissions.
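One way to do that check in a single pass is sketched below. It is only an illustration, not an HTCondor or ARC tool: the helper names (`ancestors`, `describe`, `report`) are made up for this example, and it assumes a POSIX host with Python 3. Given the output file named in the hold reason, it prints the mode, uid and gid of every directory from `/` down to the file, and flags entries whose owner write bit is clear (as with a 0400 file).

```python
import os
import stat
import sys


def ancestors(path):
    """Return the path and all of its parent directories, outermost first."""
    path = os.path.abspath(path)
    chain = [path]
    while True:
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        chain.append(parent)
        path = parent
    return list(reversed(chain))


def describe(path):
    """Collect the mode string, owner ids and owner write bit for one path."""
    st = os.stat(path)
    return {
        "path": path,
        "mode": stat.filemode(st.st_mode),  # e.g. '-r--------' for 0400
        "uid": st.st_uid,
        "gid": st.st_gid,
        "owner_writable": bool(st.st_mode & stat.S_IWUSR),
    }


def report(path):
    """Describe every component from the root down to the path itself."""
    return [describe(p) for p in ancestors(path)]


if __name__ == "__main__" and len(sys.argv) > 1:
    for entry in report(sys.argv[1]):
        flag = "" if entry["owner_writable"] else "  <-- owner cannot write"
        print("{mode} {uid:>6} {gid:>6} {path}".format(**entry) + flag)
```

Run it on the submit host (saved under any file name) with the full path from the HoldReason as its only argument, and compare the uid/gid columns with whichever account the shadow writes as on your setup.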
> On Aug 1, 2016, at 7:15 AM, Stephen Jones <sjones@xxxxxxxxxxxxxxxx> wrote:
> Hi everybody,
> When running jobs on the cluster, I come across a problem with files. The first thing I see is a job in H state:
> # condor_q 40178.0
> 40178.0 someuser20 8/1 12:22 0+00:08:46 H 0 4150.4 (arc_pilot)
> So, looking at that more deeply, I see this:
> # condor_q -long 40178.0 | grep HoldReason
> LastHoldReason = "Error from slot1@xxxxxxxxxxxxxxxxxxxx: STARTER at 192.168.22.18 failed to send file(s) to <192.168.178.102:27682>; SHADOW at 192.168.178.102 failed to write to file /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528: (errno 13) Permission denied"
> HoldReasonSubCode = 13
> HoldReason = "Error from slot1@xxxxxxxxxxxxxxxxxxxx: STARTER at 192.168.26.3 failed to send file(s) to <192.168.178.102:24166>; SHADOW at 192.168.178.102 failed to write to file /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stderr.some.url_122737.0_1470050528: (errno 13) Permission denied"
> LastHoldReasonSubCode = 13
> HoldReasonCode = 12
> LastHoldReasonCode = 12
> And I wonder what the permissions on that file are, so I look at that too.
> # ls -lrt /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528
> -r-------- 1 someuser20 grpabc 34876 Aug 1 12:56 /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528
> I wonder whether that 400 mode (read-only for the owner) is the problem, and what can be done about it.
> Anyone got any tips on this?
> Steve Jones sjones@xxxxxxxxxxxxxxxx
> Grid System Administrator office: 220
> High Energy Physics Division tel (int): 43396
> Oliver Lodge Laboratory tel (ext): +44 (0)151 794 3396
> University of Liverpool http://www.liv.ac.uk/physics/hep/