On 4/1/10 12:21 PM, Geary, Brian W. wrote:
I'm not a DB expert, but this does not sound like an obvious candidate for Condor. Instead, you should let your DB infrastructure (software + hardware + network) look after this for you.
Conditions under which Condor and a cluster may make sense:
1. The DB is static, so you only need read access to it, and it can be replicated to all cluster nodes.
2. The queries are "slow", taking on the order of minutes (or more) to complete.
3. You will be executing a lot of these in parallel a lot of the time.
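To make the three conditions concrete, here is a minimal sketch of what a single cluster job might do. It assumes, purely for illustration, that the replicated DB is an SQLite file copied to each node; `run_query` and the path passed to it are hypothetical names, not part of Condor or of any existing setup.

```python
import sqlite3


def run_query(db_path, sql, params=()):
    """Run one query against a node-local, read-only replica and return all rows.

    Hypothetical helper: assumes the static DB is an SQLite file on the node.
    """
    # mode=ro asks SQLite to refuse writes, matching the "static DB" condition.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()
```

Each job would call this once with its own slow query, so many jobs can run in parallel without touching a shared server. With a client/server DB instead of a file, the replication and read-only guarantees would have to come from the DB infrastructure itself.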
Depending on what you're trying to achieve in your "joins", and whether you have a "real DB" or simply "data organized into a DB", you may make good progress using some portion of the Hadoop project's map/reduce implementation, its file system, and the associated tools.
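For a sense of how a join looks in map/reduce terms, here is a small in-memory sketch of the general pattern (a reduce-side equi-join). It is plain Python, not the Hadoop API; the function names and the table layout are assumptions made up for the example.

```python
from collections import defaultdict


def map_phase(table_name, rows, key_index):
    """Emit (join_key, (table_name, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[key_index], (table_name, row)


def reduce_join(pairs, left_name, right_name):
    """Group pairs by key, then pair every left row with every right row per key."""
    groups = defaultdict(lambda: {left_name: [], right_name: []})
    for key, (table_name, row) in pairs:
        groups[key][table_name].append(row)

    joined = []
    for key in sorted(groups):
        for left in groups[key][left_name]:
            for right in groups[key][right_name]:
                joined.append((key, left, right))
    return joined
```

In a real map/reduce framework the grouping by key is done by the shuffle step between mappers and reducers, and the data lives in a distributed file system rather than in Python lists. Keys that appear in only one table produce no output here, which corresponds to an inner join.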
--
Ian Stokes-Rees, PhD
NEBioGrid, Harvard Medical School
ijstokes@xxxxxxxxxxxxxxxxxxx
W: http://hkl.hms.harvard.edu
T: +1 617 432-5608 x75
C: +1 617 331-5993