Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Globus error 129 with large files

Date: Fri, 11 Mar 2016 14:39:59 -0600
From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Globus error 129 with large files

Hi Emir,

Is it possible that the threshold between working and not working is either 2.1GB (about 2^31 bytes) or 4.3GB (about 2^32 bytes)? That would help narrow down the potential sources of error.
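
To make the two candidate thresholds concrete, here is a minimal sketch that prints 2^31 and 2^32 in bytes, the size of the file produced by the dd command in run.sh, and the dd counts (with bs=1M) that land just below and just above each boundary. The helper name counts_around is hypothetical, and the sketch assumes a shell with 64-bit arithmetic.

```sh
#!/bin/sh
# Assumption: the shell evaluates $(( )) with 64-bit integers
# (true for bash and dash on 64-bit Linux).
MIB=1048576                  # bytes per dd block when bs=1M

LIMIT_31=$((1 << 31))        # 2147483648 bytes
LIMIT_32=$((1 << 32))        # 4294967296 bytes
TEST_FILE=$((50000 * MIB))   # size written by run.sh: 52428800000 bytes

# Hypothetical helper: print the dd counts (bs=1M) one block below
# and one block above the given byte limit.
counts_around() {
    limit=$1
    echo "$((limit / MIB - 1)) $((limit / MIB + 1))"
}

echo "2^31 = $LIMIT_31 bytes; dd counts below/above: $(counts_around "$LIMIT_31")"
echo "2^32 = $LIMIT_32 bytes; dd counts below/above: $(counts_around "$LIMIT_32")"
echo "run.sh test file = $TEST_FILE bytes"
```

The 50000 MiB test file is far above both boundaries, so it cannot distinguish between them. Substituting one of the printed counts for count=50000 in run.sh would show on which side of each boundary the job starts to fail.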

Brian

> On Mar 11, 2016, at 2:22 PM, Emir Imamagic <eimamagi@xxxxxxx> wrote:
> 
> Hello,
> 
> We are using Condor (8.4.3) to submit jobs to remote clusters through Globus GRAM 5.2.
> 
> Lately we have noticed many jobs switching to the Held status with the following reason:
> - Globus error 129: the standard output/error size is different
> We instructed users to properly define Output/Error files as described here:
> https://twiki.opensciencegrid.org/bin/view/Documentation/Release3/GlobusError129
> 
> However, the errors continued. After some analysis we found that the problem occurs when Output, Error, or one of the transfer_output_files is very large (e.g. > XX GBs).
> 
> Here is a submit file that ends up in the error state each time we run it:
> Executable=run.sh
> transfer_output_files=testmonkey
> Output=run.$(Cluster).out
> Error=run.$(Cluster).err
> universe=grid
> grid_resource=gt5 grid.resource.hr/jobmanager-sge
> queue
> 
> where the run.sh script is the following:
> #!/bin/sh
> dd if=/dev/zero of=./testmonkey bs=1M count=50000
> 
> 
> We haven't yet performed a more thorough analysis, as we wanted to check with the list first whether this is a known issue. Please let me know if I can provide more info.
> 
> Thanks in advance
> -- 
> Emir Imamagic
> SRCE - University of Zagreb University Computing Centre, www.srce.unizg.hr
> Emir.Imamagic@xxxxxxx, tel: +385 1 616 5809, fax: +385 1 616 5559
