
Re: [Condor-users] DAG of DAGs



> Even if the vars usage you want worked at the DAG level, I don't think
> this would get you what you want.  If you're queuing 150 jobs, all
> with the same value for $(procid), all of your jobs will write to the
> same output file, which will mean that you'll only actually get the
> output back for whichever one finishes last; the earlier ones will be
> overwritten by the last one. 
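As an aside, the usual way to avoid that collision is to key each job's
output file on something unique -- $(Cluster).$(Process) in the submit
description, or a per-node value handed down from the DAG with VARS.
For example (the node and file names here are illustrative):

    # in the .dag file: give the node a unique macro value
    VARS node0001 nodeid="0001"

    # in the shared submit description: use it in the output path
    output = node-$(nodeid).out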

I have recently been thinking about a similar situation, regarding
merged STDOUT.  We already do something like this with a per-node
results file that gets merged into a per-DAG aggregated results file
simply by running "cat node-XXXX-results.dat >> dag-results.dat" in the
node POST script.  We haven't had any problems with concurrent IO, even
though 30-50 nodes can finish per second -- something I had worried
might be a problem.
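
A minimal sketch of that arrangement, to make the mechanics concrete
(the node name, script name, and exact file names are illustrative
rather than our real setup):

    # --- in the .dag file: run a merge script when the node finishes ---
    JOB node0001 node.sub
    SCRIPT POST node0001 append-results.sh node-0001-results.dat dag-results.dat

    # --- append-results.sh ---
    #!/bin/sh
    # Append this node's results file ($1) to the DAG-wide file ($2).
    cat "$1" >> "$2"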

Merging STDOUT into a single file for the whole DAG using node POST
scripts may be more problematic -- STDOUT can run to thousands of lines
and 100-200 KB per node.  At a node completion rate of 50 Hz, that would
add about 10 MB a second to the merged file.  Even if the shell properly
synchronized and locked the output file descriptor, we could end up with
a backlog that causes a load spike (just from process IO
blocking/contention).
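
For what it's worth, rather than relying on the shell to serialize the
appends, a POST script could take an explicit advisory lock with
flock(1) (util-linux, so Linux-specific).  This is only a sketch of one
possibility, not something we have tested at these rates:

    #!/bin/sh
    # Sketch of a POST script that appends one node's STDOUT to a
    # DAG-wide merged file, serializing writers with flock(1).
    # Usage: merge-stdout.sh <node-stdout-file> <merged-file>
    node_out="$1"
    merged="$2"

    # Open the merged file for append on fd 9, take an exclusive lock,
    # and copy the node's output while holding it.
    (
        flock -x 9
        cat "$node_out" >&9
    ) 9>>"$merged"

Each writer still blocks until it holds the lock, so this controls
interleaving but not the aggregate IO load.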

For the moment, we only merge the results at the end of the DAG.

Perspectives on our proposed STDOUT merge via POST script would be
appreciated.

Regards,

Ian