
[Condor-users] Sample parameter sweep application


I stuck my foot in my mouth there, so to make up for it, let me help you however I can.  I have a nice and relevant application which I am running right now, and which will give you some good graphics as output.  It does protein model molecular replacement, and there are a number of different parameters that can be varied through a command file.  You need to download and compile the application from here:


then I can give you a set of data files or you can wget them from here:


and you need one "MTZ" file, for example:


Then you want to run the following loops:

1. Loop over every file in the "biodb/balbes" directory.

2. For each file, use the base input parameter file "molrep.cmd":

_SIM 1.0
_COMPL 0.5
_NP     10
_NPT    10

3. Execute the application with the command:

cat molrep.cmd | molrep HKLIN TGase.mtz MODEL 2oqo.pdb

4. Then repeat, varying the following parameters:
   _RESMAX from 8.0 down to 3.5 in steps of 1.5
   _SIM from 0.0 to 1.0 in steps of 0.1
   _COMPL from 0.0 to 1.0 in steps of 0.1
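
The loops above could be sketched roughly as follows in Python. This is only a sketch under some assumptions: that the three parameters are meant to be combined as a full grid (rather than varied one at a time), that _RESMAX is written to the command file in the same "keyword value" form as the other entries, and that molrep reads its commands from standard input as in the shell pipeline above. The names frange, command_file, sweep and run_molrep are made up for illustration.

```python
import itertools
import subprocess


def frange(start, stop, step, ndigits=1):
    """Inclusive list of floats from start to stop, rounded to avoid drift."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, ndigits) for i in range(n + 1)]


RESMAX = frange(8.0, 3.5, -1.5)   # 8.0, 6.5, 5.0, 3.5
SIM = frange(0.0, 1.0, 0.1)       # 0.0, 0.1, ..., 1.0
COMPL = frange(0.0, 1.0, 0.1)     # 0.0, 0.1, ..., 1.0


def command_file(resmax, sim, compl):
    """Text of one molrep.cmd variant (the _RESMAX line format is assumed)."""
    return (
        f"_RESMAX {resmax}\n"
        f"_SIM {sim}\n"
        f"_COMPL {compl}\n"
        "_NP     10\n"
        "_NPT    10\n"
    )


def sweep(models):
    """Yield (model, command text) for every model and grid point."""
    for model in models:
        for resmax, sim, compl in itertools.product(RESMAX, SIM, COMPL):
            yield model, command_file(resmax, sim, compl)


def run_molrep(model, commands, mtz="TGase.mtz"):
    """Same as: cat molrep.cmd | molrep HKLIN TGase.mtz MODEL <model>"""
    return subprocess.run(
        ["molrep", "HKLIN", mtz, "MODEL", str(model)],
        input=commands, text=True, capture_output=True, check=True,
    )
```

With these intervals the full grid is 4 x 11 x 11 = 484 runs per model file, which is why a coarser grid is probably the way to go for a demo. In a Condor setting each (model, command text) pair would become one job rather than a local run_molrep call.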

Let me know if you need any more suggestions on how to proceed with this.  I'd be very interested to see your results!  Each invocation should take 10-30 seconds, so you may want to use a coarser grid and fewer input files.



Matias Alberto Gavinowich wrote:

I am working on software capable of handling parameter sweep
application submissions on a cluster or grid environment (for the time
being, I am using Condor), for my thesis. I am looking for sample
applications which I may use to test the software I've developed. I
would appreciate your help and suggestions, perhaps someone in the
community can point me to some application which meets my needs.

What I am looking for is a non-data intensive application for which it
may be desired to perform several runs varying a parameter. This
parameter is an input file. For instance, something like a program
which finds the roots of a mathematical function, which I would run as:

find_roots range.txt

Range would contain something like 1-10000, and I would submit several
ranges, so each run can take place in a separate computer and while
one processes a range, the other processes another range.
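
As a rough illustration of the range splitting described here, the following Python sketch cuts one inclusive range into contiguous sub-ranges and writes one range file per job. The file naming (range_0.txt, range_1.txt, ...) and the "lo-hi" file content are assumptions based on the "1-10000" example, and find_roots itself is hypothetical.

```python
from pathlib import Path


def split_range(lo, hi, parts):
    """Split the inclusive range lo..hi into up to `parts` contiguous pieces."""
    total = hi - lo + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` pieces get one additional element each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def write_range_files(lo, hi, parts, out_dir="."):
    """Write one 'lo-hi' file per sub-range and return the file paths."""
    paths = []
    for i, (a, b) in enumerate(split_range(lo, hi, parts)):
        path = Path(out_dir) / f"range_{i}.txt"
        path.write_text(f"{a}-{b}\n")
        paths.append(path)
    return paths
```

Each resulting file would then be the single input of one "find_roots range_N.txt" job.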

The program can produce output on the standard output as well as on
another file on disk, or both. The key is that both find_roots and
range.txt are files of relatively small size, and that the benefit of
running each range on a separate computer is that overall speed is
increased. For demonstration purposes, this program should take a bit
to run, but I would like to be able to make a couple of runs fit into
a short demo. Its purpose and produced output should also be
relatively easy to understand by an audience new to it.

I have been looking for real-world (or near real-world) applications
that resemble what I've just described, but the ones I found require
large databases to be present, require several input files instead of
one, the files involved are very large, etc.

If anyone can point me to an application which works like I described,
I'd greatly appreciate it. Availability of source code is a plus, but
not essential.



Ian Stokes-Rees                            W: http://sbgrid.org
ijstokes@xxxxxxxxxxxxxxxxxxx               T: +1 617 418-4168
SBGrid, Harvard Medical School             F: +1 617 432-5600
