Re: [HTCondor-users] change to condor_submit - user feedback desired! (was Re: multiple condor_submit's - one cluster)
- Date: Fri, 6 Feb 2015 10:40:45 -0600
- From: David Champion <dgc@xxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] change to condor_submit - user feedback desired! (was Re: multiple condor_submit's - one cluster)
Todd -
* On 06 Feb 2015, Todd Tannenbaum wrote:
>
> 1. A "queue foreach <filepattern>" command. Folks could then have submit
> files that look like this:
>
> input = $(file)
> output = $(file).output
> queue foreach data/*.csv
>
> So for each file in subdir data that ends in .csv, a job would be submitted
> and $(file) would expand to the path to the file.
I like the general idea. Maybe "queue for file in data/*.csv" instead,
to let the user name the variable (and to match common shell syntax).
Some more ideas below that go beyond this, though.
> 2. A command line option to condor_submit that tells it to read stdin, and
> to do a submit for each stdin line, substituting each line from stdin with
> $(input_line). Folks could then have submit files that look like this:
>
> input = $(input_line)
> output = $(input_line)
> queue
>
> and invoke condor_submit via lines like:
>
> ls data/*.csv | grep foo | condor_submit -submit_per_line
Also an interesting idea. Again, maybe allow the user to specify the
macro name?
    ls data/*.csv | grep foo | condor_submit -submit_per_line input_line
If the name is hardcoded, perhaps "$(stdin)" rather than "$(input_line)".
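To make the per-line idea concrete, here is a minimal Python sketch of what such a substitution could do with a user-chosen macro name. It only generates submit description text and does not call condor_submit; the function name `expand_per_line`, the template, and the sample paths are all hypothetical, and `-submit_per_line` itself is still only the proposed option.

```python
import io
from typing import Iterable, List


def expand_per_line(template: str, lines: Iterable[str],
                    macro: str = "input_line") -> List[str]:
    """Return one submit description per non-blank input line.

    Every occurrence of $(macro) in the template is replaced with the line.
    """
    placeholder = "$(" + macro + ")"
    descriptions = []
    for raw in lines:
        value = raw.strip()
        if not value:
            continue  # blank lines produce no job
        descriptions.append(template.replace(placeholder, value))
    return descriptions


# Hypothetical template; the macro name "file" is picked by the user.
TEMPLATE = "input = $(file)\noutput = $(file).output\nqueue\n"

# io.StringIO stands in for the piped stdin of the proposed option.
fake_stdin = io.StringIO("data/a.csv\ndata/b.csv\n")
for description in expand_per_line(TEMPLATE, fake_stdin, macro="file"):
    print(description)
```

Passing the macro name as a parameter is the whole point here: the same template works whether the user prefers `$(file)`, `$(stdin)`, or `$(input_line)`.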
A few other thoughts to throw out there, riffing on this basic
need/interest:
1. What about a more general looping capability? This adds
another concept (`cmd` command substitution), but it is something I'm
exploring with a condor_submit wrapper. (You can never have enough
condor_submit wrappers, it seems.) I don't care much about the specific
syntax; this is just an illustration:
    for file in `ls data/*.csv`
        input = $(file)
        output = $(file).output
        queue
    end
You've probably gotten this before, and I don't know what the issues
are, so feel free just to say "we already decided not to do this."
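For illustration only, a wrapper could expand such a loop before handing the result to condor_submit. The sketch below is an assumption about how that might work, not how any existing wrapper does it: `expand_loops`, `shell_words`, and the regular expression are hypothetical names, nesting is not supported, and the backquoted command is run through the shell, so the input must be trusted.

```python
import re
import subprocess
from typing import Callable, List

# Matches a line such as:  for file in `ls data/*.csv`
LOOP_RE = re.compile(r"^for\s+(\w+)\s+in\s+`(.+)`\s*$")


def shell_words(cmd: str) -> List[str]:
    """Run cmd in a shell and split its standard output on whitespace."""
    result = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, check=True)
    return result.stdout.split()


def expand_loops(text: str,
                 run: Callable[[str], List[str]] = shell_words) -> str:
    """Unroll each for/end block once per word produced by its command."""
    out: List[str] = []
    body = None          # lines collected inside the current loop, else None
    var = ""             # loop variable name
    items: List[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        match = LOOP_RE.match(stripped)
        if body is None and match:
            var, items, body = match.group(1), run(match.group(2)), []
        elif body is not None and stripped == "end":
            for item in items:
                out.extend(b.replace("$(" + var + ")", item) for b in body)
            body = None
        elif body is not None:
            body.append(stripped)
        else:
            out.append(line)
    if body is not None:
        raise ValueError("unterminated 'for' block")
    return "\n".join(out) + "\n"
```

The `run` parameter exists so the command runner can be swapped out, for example with a function that returns a fixed list when trying the expansion without touching the filesystem.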
2. Make a macro that always reads one more line from stdin at the time
it's evaluated. And make a queue variation that queues until some
condition is true.
    # read a line
    current_file = $(stdin)
    # apply that line to two other settings
    input = $(current_file).in
    output = $(current_file).out
    # keep going until no more lines (or blank line)
    queue until $(current_file) == ""
I'm not sure this is a complete concept but maybe you get the idea.
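One open question is whether the "until" test runs before or after a job is queued. The Python sketch below models the macro as "read one more line each time it is evaluated" and assumes the test runs first, so the terminating blank value never becomes a job. The function name `queue_until_blank` and the dictionary representation of a job are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def queue_until_blank(stream: TextIO) -> List[Dict[str, str]]:
    """Build one job per line until a blank line or end of input."""
    jobs: List[Dict[str, str]] = []
    while True:
        # Evaluating $(stdin) consumes exactly one line; readline()
        # returns "" at end of input, which also ends the loop.
        current_file = stream.readline().strip()
        # Assumption: the "until" condition is checked before queueing.
        if current_file == "":
            break
        jobs.append({
            "input": current_file + ".in",
            "output": current_file + ".out",
        })
    return jobs


# io.StringIO stands in for stdin; the line after the blank is never read.
for job in queue_until_blank(io.StringIO("run1\nrun2\n\nrun3\n")):
    print(job)
```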
3. Finally, as to the specific syntax of $(stdin) (or $(input_line),
whatever): Maybe it makes sense to create a general $(<name) notation,
where name identifies a file (or fd, in limited cases like stdin) to
read from. Then you can read the list of parameters from a static file
in addition to reading from stdin.
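A rough Python sketch of how a $(<name) macro might behave, assuming each evaluation reads the next line from the named source and that the name "stdin" is special-cased. The class `LineSources` and its methods are hypothetical and only illustrate the notation; what happens when a source runs out of lines is left open (here it yields an empty string).

```python
import re
import sys
from typing import Dict, Optional, TextIO

# Matches $(<name), where name is a file path or the word "stdin".
READ_MACRO = re.compile(r"\$\(<([^)]+)\)")


class LineSources:
    """Keeps one open handle per name so successive reads advance."""

    def __init__(self, stdin: Optional[TextIO] = None) -> None:
        self._open: Dict[str, TextIO] = {
            "stdin": stdin if stdin is not None else sys.stdin
        }

    def next_line(self, name: str) -> str:
        """Return the next line from the named source, without newline."""
        if name not in self._open:
            self._open[name] = open(name, "r", encoding="utf-8")
        return self._open[name].readline().rstrip("\n")

    def expand(self, text: str) -> str:
        """Replace every $(<name) in text with the next line from name."""
        return READ_MACRO.sub(lambda m: self.next_line(m.group(1)), text)

    def close(self) -> None:
        """Close every file this object opened; stdin is left alone."""
        for name, handle in self._open.items():
            if name != "stdin":
                handle.close()
```

With this shape, a static parameter file and piped input go through the same code path, which is why no separate command line option would be needed.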
I think that with #1 or #3 (#2 being beside the point) there's no need
for a new command line option.
Sorry for the length. I can do this all day but at some point we all
need to work. :)
--
David Champion | dgc@xxxxxxxxxxxx | University of Chicago | OSG Connect