
Re: [HTCondor-users] 'permission denied' brought some jobs to H status



Hi, Jaime:

Thanks for the idea. Gareth Roy has uploaded our search results to the GGUS ticket, so maybe it's better to discuss the details there?

  https://ggus.eu/index.php?mode=ticket_info&ticket_id=113745

I also tested a job submitted with glite-wms-job-submit: when the job requests 1 CPU it is put on hold, but when it requests more than 1 CPU it finishes successfully. So this looks more like a job wrapper issue in ARC.

  Cheers,Gang

On 04/06/2015 22:28, Jaime Frey wrote:
On Jun 4, 2015, at 10:41 AM, qing <gang.qin@xxxxxxxxxxxxx> wrote:
Dear Zach:

  Thanks for the hints. This seems to be a common issue with ARC 5.0.0 and is already being tracked at https://ggus.eu/index.php?mode=ticket_info&ticket_id=113745, where the Condor team indicates that the problem might be in the ARC job wrapper.

  Cheers,Gang

On 04/06/2015 16:29, Zachary Miller wrote:
Hold reason: Error from slot1@node128: STARTER at 10.141.0.128 failed to send file(s) to <10.141.255.19:57731>; SHADOW at 10.141.255.19 failed to write to file /var/spool/arc/grid/dfgMDmjkVKmnbbfC3pqhhxZmABFKDmABFKDmZnFKDmABFKDm7g3Yon/_condor_stderr.aipanda063.cern.ch_15422080.0_1433368150: (errno 13) Permission denied
This "Permission denied" is coming from the filesystem on the submit machine.

Some things to consider:

   Is it local disk, or some kind of network mount?
   Was it close to full (such that the output files won't fit)?
   Too many files in /var/spool/arc/grid/?
   Any other reason you can think of why you wouldn't be able to write a file.
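
The checks above could be scripted roughly as follows. This is only a minimal sketch in Python: the function name diagnose_dir and the keys it returns are made up for illustration, and it has to be run as the user that writes the output files for the "writable" result to mean anything. Whether the directory is on local disk or a network mount is easier to see with "df -T" on the submit machine.

```python
import os
import shutil


def diagnose_dir(path):
    """Collect basic facts about why a write into `path` might fail.

    Hypothetical helper for illustration; not part of HTCondor or ARC.
    """
    usage = shutil.disk_usage(path)  # total/used/free bytes of the filesystem
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        # Creating a file needs write and search (execute) permission on the dir.
        "writable": os.access(path, os.W_OK | os.X_OK),
        # Fraction of the filesystem still free (0.0 = full, 1.0 = empty).
        "free_fraction": free_fraction,
        # Number of directory entries, to spot "too many files".
        "entries": sum(1 for _ in os.scandir(path)),
    }


if __name__ == "__main__":
    # Path taken from the hold reason above; adjust as needed.
    print(diagnose_dir("/var/spool/arc/grid"))
```

A "writable" value of False for the account that owns the shadow process would point at ownership or mode bits rather than at disk space.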
Hi.
I've been working with Raul on this problem. My current suspicion is that it is caused by using a Condor-G version older than 8.0.5 to submit the jobs to the ARC server. This causes a collision on the filename _condor_stderr between Condor-G and the HTCondor cluster behind the ARC server.

Can you confirm if this is the case for the jobs that are failing for you?
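
One way to check a submitter's version against that threshold is sketched below in Python. The 8.0.5 cutoff comes from the message above; the helper name is_older_than is hypothetical, and the input is assumed to be any string containing a dotted three-part version number, such as the text printed by condor_version on the submitting host.

```python
import re


def is_older_than(version_text, minimum="8.0.5"):
    """Return True if the first X.Y.Z found in version_text is below minimum.

    Hypothetical helper for illustration; not part of HTCondor.
    """
    def parse(text):
        match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
        if match is None:
            raise ValueError("no X.Y.Z version found in %r" % text)
        # Compare numerically, so that 8.0.10 sorts after 8.0.5.
        return tuple(int(part) for part in match.groups())

    return parse(version_text) < parse(minimum)


if __name__ == "__main__":
    for candidate in ("8.0.4", "8.0.5", "8.2.1"):
        print(candidate, is_older_than(candidate))
```

Versions are compared as integer tuples rather than as strings, since a plain string comparison would rank "8.0.10" below "8.0.5".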

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project

