
Re: [HTCondor-users] STARTER at 192.168.22.18 failed to send file(s) to <192.168.178.102:27682>; SHADOW ...



Hi Stephen,

Is it possible that the interaction between ARC and HTCondor is causing a permission change on the directory itself? I would look at the spool directory permissions.
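
A minimal sketch of one way to check this, assuming GNU stat is available on the submit host. The helper name show_perms and the JOB_FILE variable are illustrative only; they are not part of HTCondor or ARC. It prints the octal mode, owner and group of a path and of every parent directory, so a restrictive mode anywhere along the path stands out.

```shell
#!/bin/sh
# Sketch: print octal mode, owner:group and name for a path and each of its
# parent directories, walking up to the root. Relies on GNU stat (-c).
# show_perms is a hypothetical helper, not an HTCondor or ARC tool.
show_perms() {
    p=$1
    while :; do
        stat -c '%a %U:%G %n' "$p"
        # Stop at the filesystem root, or at "." for a relative path.
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}

# Example call (JOB_FILE is a placeholder for the file named in HoldReason):
# show_perms "$JOB_FILE"
```

Running it as the job owner and comparing the output for a held job with that of a job that completed normally should show whether the directory or the file mode differs.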

Brian

> On Aug 1, 2016, at 7:15 AM, Stephen Jones <sjones@xxxxxxxxxxxxxxxx> wrote:
> 
> Hi everybody,
> 
> When running jobs on the cluster, I come across a problem with files. The first thing I see is a job in H state:
> 
> # condor_q 40178.0
> 
> 40178.0   someuser20        8/1  12:22   0+00:08:46 H  0   4150.4 (arc_pilot)
> 
> So, looking at that more deeply, I see this:
> 
> # condor_q -long 40178.0 | grep HoldReason
> 
> LastHoldReason = "Error from slot1@xxxxxxxxxxxxxxxxxxxx: STARTER at 192.168.22.18 failed to send file(s) to <192.168.178.102:27682>; SHADOW at 192.168.178.102 failed to write to file /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528: (errno 13) Permission denied"
> HoldReasonSubCode = 13
> HoldReason = "Error from slot1@xxxxxxxxxxxxxxxxxxxx: STARTER at 192.168.26.3 failed to send file(s) to <192.168.178.102:24166>; SHADOW at 192.168.178.102 failed to write to file /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stderr.some.url_122737.0_1470050528: (errno 13) Permission denied"
> LastHoldReasonSubCode = 13
> HoldReasonCode = 12
> LastHoldReasonCode = 12
> 
> And I wonder what the permissions on that file are, so I look at that too.
> 
> # ls -lrt /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528
> 
> -r-------- 1 someuser20 grpabc 34876 Aug  1 12:56 /var/spool/arc/grid/Sx5KDmdTQoonwOMCrq6pnv5nABFKDmABFKDmqtHKDmABFKDmgWcnhn/_condor_stdout.some.url_122737.0_1470050528
> 
> And I wonder if that 400 in the permissions is a problem. And what can be done about it?
> 
> Anyone got any tips on this?
> 
> Cheers,
> 
> Ste
> 
> -- 
> Steve Jones                             sjones@xxxxxxxxxxxxxxxx
> Grid System Administrator               office: 220
> High Energy Physics Division            tel (int): 43396
> Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 3396
> University of Liverpool                 http://www.liv.ac.uk/physics/hep/