
[HTCondor-users] Globus error 129 with large files


We are using Condor (8.4.3) to submit jobs to remote clusters through Globus GRAM 5.2.

Lately we have noticed many jobs going into the Hold state with the following reason:
- Globus error 129: the standard output/error size is different
We instructed users to properly define Output/Error files as described here:


However, the errors continued. After some analysis we found that the problem occurs when Output, Error, or one of the transfer_output_files is very large (e.g. > XX GBs).

Here is a script that ends up in an error state each time we run it:
grid_resource=gt5 grid.resource.hr/jobmanager-sge

where the run.sh script is the following:
dd if=/dev/zero of=./testmonkey bs=1M count=50000
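
For scale, that dd invocation writes 50000 blocks of 1 MiB each. Below is a minimal sketch of the arithmetic, assuming GNU dd (where bs=1M means 1048576 bytes) and a shell with 64-bit arithmetic; the variable names are illustrative only and are not part of run.sh:

```shell
# Size of the file produced by:
#   dd if=/dev/zero of=./testmonkey bs=1M count=50000
block_bytes=$((1024 * 1024))          # bs=1M: 1048576 bytes per block (GNU dd)
count=50000                           # count=50000 blocks
size_bytes=$((block_bytes * count))   # total bytes written to ./testmonkey

echo "$size_bytes bytes"                                        # 52428800000 bytes
echo "$((size_bytes / 1024 / 1024 / 1024)) GiB (rounded down)"  # 48 GiB (rounded down)
```

So the test output file is roughly 52 GB (about 48.8 GiB).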

We haven't yet performed a more thorough analysis, as we wanted to check with the list first whether this is a known issue. Please let me know if I can provide more info.

Thanks in advance
Emir Imamagic
SRCE - University of Zagreb University Computing Centre, www.srce.unizg.hr
Emir.Imamagic@xxxxxxx, tel: +385 1 616 5809, fax: +385 1 616 5559