
Re: [HTCondor-users] change to condor_submit - user feedback desired! (was Re: multiple condor_submit's - one cluster)



Is it really much different from running
  ls *.csv | grep foo | xargs -I{} env inputline={} condor_submit domystuff.cfg
and using $ENV(inputline) to retrieve the value of inputline?

input = $ENV(inputline)
output = $ENV(inputline).output
queue 
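
For completeness, a full domystuff.cfg along those lines might look something like the sketch below (the executable and log names are just placeholders):

  universe   = vanilla
  executable = domystuff
  input      = $ENV(inputline)
  output     = $ENV(inputline).output
  error      = $ENV(inputline).error
  log        = domystuff.log
  queue

Each invocation of condor_submit then picks up whatever inputline happens to be set to in its environment.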

Doing it within HTCondor would make more sense if it significantly reduced the per-submit "latency" (for want of a better term) so that the overall submit rate increases (as Donald Krieger said).

Klint.


-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: Saturday, 7 February 2015 3:00 AM
To: HTCondor-Users Mail List
Subject: [HTCondor-users] change to condor_submit - user feedback desired! (was Re: multiple condor_submit's - one cluster)

On 2/6/2015 9:04 AM, Krieger, Donald N. wrote:
> Hi Todd,
>
> Thanks for posting back with the answers. It's very helpful.
>
> My problem is that the names I'm using for the log files step through 
> a sequence of strings rather than a sequence of numbers. And I have a 
> downstream management routine that uses those names which I would have 
> to alter if I changed the naming sequence.

Hi Don,

Thanks for the above explanation.  Indeed, $(Process) is great in your submit files, but only if your data files are numerically sequenced!

Perhaps we could create a general solution in condor_submit that addresses the above and yet would still a) allow all the submits to happen at once into one cluster and b) not require the user to know how to write scripts.

I listed a couple of brainstorm ideas below that would be relatively easy to implement.  Do folks think either of them would be helpful?  I would love to hear any feedback or alternative ideas.

regards,
Todd

Some brainstorm ideas:

1. A "queue foreach <filepattern>" command.  Folks could then have submit files that look like this:

     input = $(file)
     output = $(file).output
     queue foreach data/*.csv

So for each file in subdir data that ends in .csv, a job would be submitted and $(file) would expand to the path to the file.

and/or

2. A command line option to condor_submit that tells it to read stdin and do a submit for each line, making the contents of that line available in the submit file as $(input_line).  Folks could then have submit files that look like this:

    input = $(input_line)
    output = $(input_line).output
    queue

and invoke condor_submit via lines like:

    ls data/*.csv | grep foo | condor_submit -submit_per_line
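
For comparison, the sort of wrapper script a user has to write today to get a similar effect is roughly the sketch below (the submit file name and the INPUT_FILE variable are just placeholders; the submit file would read the value with $ENV(INPUT_FILE)):

    #!/bin/sh
    # One condor_submit per file: each run becomes its own cluster,
    # and the submit file picks up the file name via $ENV(INPUT_FILE).
    for f in data/*.csv; do
        INPUT_FILE="$f" condor_submit mysubmit.sub
    done

Each iteration pays the full cost of a separate condor_submit and creates a separate cluster, which is exactly what (a) and (b) above are meant to avoid.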

Comments? Other ideas?