
[HTCondor-users] Globus error 129 with large files



Hello,

We are using Condor (8.4.3) to submit jobs to remote clusters through Globus GRAM 5.2.

Lately we have noticed a lot of jobs going into Hold status with the following reason:
- Globus error 129: the standard output/error size is different
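
For reference, the hold reason can be inspected on the submit node with standard condor_q options (condor_q -hold lists held jobs together with their reason; the cluster id below is only an example):

condor_q -hold
condor_q -af HoldReasonCode HoldReason 1234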
We instructed users to properly define Output/Error files as described here:

https://twiki.opensciencegrid.org/bin/view/Documentation/Release3/GlobusError129

However, the errors continued. After some analysis we found that the problem occurs when Output, Error, or one of the transfer_output_files is very large (e.g. > XX GB).

Here is a submit description file whose job ends up held with this error each time we run it:
Executable=run.sh
transfer_output_files=testmonkey
Output=run.$(Cluster).out
Error=run.$(Cluster).err
universe=grid
grid_resource=gt5 grid.resource.hr/jobmanager-sge
queue

The run.sh script is the following:
#!/bin/sh
# create a large test file to be transferred back as job output
dd if=/dev/zero of=./testmonkey bs=1M count=50000
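
For scale, 50000 blocks of 1 MiB is roughly 49 GiB of zeros; the resulting file size can be verified with, e.g.:

ls -lh testmonkey
du -h testmonkey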


We haven't performed a more thorough analysis yet, as we wanted to check with the list whether this is a known issue. Please let me know if I can provide more info.

Thanks in advance
--
Emir Imamagic
SRCE - University of Zagreb University Computing Centre, www.srce.unizg.hr
Emir.Imamagic@xxxxxxx, tel: +385 1 616 5809, fax: +385 1 616 5559