
Re: [Condor-users] Condor feature suggestion: automatically compressed output files

On 7/30/06, Alex Gontmakher <gsasha@xxxxxxxxxxxxxxxxx> wrote:
Oh, checkpoint is a problem indeed, didn't think of that (but can't the state
of the compression algorithm be checkpointed as well?)

I wouldn't want to be the person supporting this.

Oh, my suggestion was to analyze the extension of the output file and use a
corresponding engine, i.e., use bzip2 if the file name ends with ".bz2" etc.
This way, the user gets a say on what engine to use, and the decompression
programs automatically will recognize the file.
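For illustration only, the extension-based selection described above could be sketched as follows. This is not Condor code; the table `OPENERS` and the helper `open_output` are hypothetical names, and Python's standard compression modules stand in for whatever engines Condor might actually ship.

```python
import bz2
import gzip
import lzma
from pathlib import Path

# Hypothetical suffix-to-engine table (an assumption, not a Condor feature).
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_output(path, mode="wt"):
    """Open `path` for writing, picking the compressor from its suffix.

    Unknown suffixes fall back to a plain, uncompressed file.
    """
    opener = OPENERS.get(Path(path).suffix.lower(), open)
    return opener(path, mode)
```

A job writing to "results.bz2" would then get bzip2 output that the standard bunzip2 tool recognises, while "results.txt" stays uncompressed.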

A fair point, but then which ones would be included? This all adds
complexity that I don't think Condor needs (my personal
opinion, of course).

Er??? I can't comment on the amount of recoding necessary - or at least, my
estimate is guesswork as I don't really know the architecture of Condor, but
I don't think my proposed solution is that inflexible...
> Admittedly this is at the cost of one additional line/entry in the
> submit script (to transfer the 'real' exe as well as the script) and
> of course the lack of standard universe (but as we mentioned before
> this has complications if you do that anyway).
> For the standard universe there is a reasonable likelihood that, if
> you can relink the app, you can probably also change it to output
> differently.
Well then, how do you propose changing a C program to compress its standard

I'm no C programmer except when absolutely necessary but google and
some looking gets me

to get a file (note the provisos), so no streaming compression. (I concur
with the suggested 'best' behaviour in the next answer of sending all
output calls to your own redirectable function.)
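A rough sketch of the "own redirectable function" idea, written in Python rather than C purely for brevity. The names `set_sink` and `emit` are hypothetical; the point is only that if every output call goes through one function, the destination can be swapped for a compressed stream without touching the rest of the program.

```python
import gzip
import sys

# Hypothetical module-level output target; defaults to normal stdout.
_sink = sys.stdout


def set_sink(stream):
    """Redirect all later emit() calls to `stream`."""
    global _sink
    _sink = stream


def emit(text):
    """The single output routine the rest of the program calls."""
    _sink.write(text)
```

Calling set_sink(gzip.open("job.out.gz", "wt")) at start-up would compress everything written via emit(); the program must still close that stream before exiting so the gzip trailer is written.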

Seems like it is trying to solve your problem amongst others

> The only really big saving you would get in complexity is if the job
> is cross platform, then wrapping in a script means creating multiple
> scripts each doing it differently. I admit this sounds nice but I
Well, our cluster happens to be cross platform indeed - a mix of Intel
machines and PowerPC blades... and the code I run on it is compiled and works
on both.

I meant cross-platform in terms of OS (since the std io streams'
behaviour is, as I understand it, more OS-specific than
architecture-specific).

I'm not saying I don't think this is a good idea in general, I'm just
saying that layering more complexity into Condor's existing I/O
functionality, when the functionality can be largely gained via
existing mechanisms, is not always a good idea...
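
As a minimal sketch of the wrapper-script route discussed in this thread (an assumption about how one might do it, not something Condor provides): the wrapper is submitted as the job's executable, the 'real' exe is transferred alongside it, and the wrapper streams the exe's stdout into a gzip file. The helper name `run_compressed` is hypothetical.

```python
import gzip
import shutil
import subprocess
import sys


def run_compressed(cmd, out_path):
    """Run `cmd`, stream its stdout into a gzip file, return its exit code."""
    with gzip.open(out_path, "wb") as sink:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        # Copy in chunks so large outputs never sit fully in memory.
        shutil.copyfileobj(proc.stdout, sink)
        proc.stdout.close()
        return proc.wait()
```

A wrapper built on this would call something like run_compressed(["./real_exe", "arg1"], "job.out.gz") and exit with the returned code. It needs a Python interpreter on every execute machine, and it does not help with the standard universe.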