Re: [HTCondor-users] Put jobs on hold if output or error files grow large?
- Date: Wed, 23 Jul 2014 14:21:23 -0500
- From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Put jobs on hold if output or error files grow large?
Would MAX_TRANSFER_OUTPUT_MB be what you are looking for?
That places the job on hold if the final output files (all of them in aggregate) are above a certain size.
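As a sketch of Brian's suggestion: `MAX_TRANSFER_OUTPUT_MB` is a pool-wide configuration knob, so it goes into the condor_config rather than the submit file. Note that it is checked when output files are transferred back (i.e. at job completion), not while the job is still running. The value below is only an illustrative choice.

```
# condor_config sketch (illustrative value): put a job on hold if the
# aggregate size of its transferred output files exceeds this limit in MB.
# This is evaluated at output-transfer time, not during execution.
MAX_TRANSFER_OUTPUT_MB = 100
```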
On Jul 23, 2014, at 2:03 PM, Carsten Aulbert <Carsten.Aulbert@xxxxxxxxxx> wrote:
> We have some DAGMan-based pipelines which may or may not cause massive
> trouble depending on the input data set. The first sign of trouble is
> that enormous amounts of data are written to stderr and thus end up in
> the file referenced by "err" in the submit file.
> Obviously, the correct fix would be to patch the programs to detect
> this themselves, but given that the pipeline is (a) complex, (b) partly
> ancient, and (c) the exact location of the problem may shift on top of
> all that complexity, I'm currently looking for a way to put these jobs
> on hold once the error log file grows beyond, say, 10 or 100 MByte.
> We tried "periodic_hold" first, but I'm not sure there is a way to
> check file sizes there, and browsing the manual did not reveal
> anything that really matches.
> Has anyone ever tried this (or did I just miss an obvious way)?
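One hedged sketch of the periodic_hold approach Carsten mentions: the job ClassAd attribute `DiskUsage` tracks the total size of the job's scratch sandbox (in KiB, updated periodically), which includes the stderr file but also everything else the job writes. It is therefore only an approximation of "err grew too large", but it does catch runaway output while the job is still running. The executable and file names below are placeholders.

```
# Submit-file sketch (placeholder names). DiskUsage is the sandbox size
# in KiB, so 100 * 1024 corresponds to roughly 100 MByte. This triggers
# on total scratch usage, not on the stderr file alone.
executable = my_pipeline_step
output     = job.out
error      = job.err
log        = job.log

periodic_hold        = DiskUsage > 100 * 1024
periodic_hold_reason = "Sandbox exceeded 100 MB; runaway stderr suspected"

queue
```

Since `DiskUsage` is only refreshed at the starter's update interval, the job may overshoot the limit somewhat before the hold takes effect.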
> Dr. Carsten Aulbert - Max Planck Institute for Gravitational Physics
> Callinstrasse 38, 30167 Hannover, Germany
> phone/fax: +49 511 762-17185 / -17193