
[Condor-users] Network traffic and data storage recommendations [SEC=UNCLASSIFIED]



UNCLASSIFIED

Hi,

 

I am new to Condor and was wondering what schemes people in the Condor community use to manage the amount of data and network traffic their jobs produce.

 

For example, I need to submit jobs that each take an input and an output file as command-line arguments, e.g. MyExecutable.exe -in inputFile -out outputFile. In extreme cases, a batch may contain 10 million jobs, each creating a 1GB output file. To limit network traffic, I would not want the output files transferred back to the submit machine as the jobs complete. The submit machines also won't have much direct attached storage, so each execution machine in the pool would have direct attached storage to hold its own output files. A daemon would then bring all the output files together in a central location for analysis.
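
To put the extreme case in numbers, here is a rough back-of-envelope sketch. The job count and file size are the figures given above. The 1 Gbit/s link speed is purely an assumption for illustration, and the function names are hypothetical.

```python
# Back-of-envelope estimate of the output volume for one batch.
# Job count and file size come from the extreme case described above;
# the link speed is an ASSUMED figure, not a measurement of any real network.

JOBS_PER_BATCH = 10_000_000          # 10 million jobs
OUTPUT_BYTES_PER_JOB = 10**9         # 1 GB per output file (decimal gigabyte)
ASSUMED_LINK_BITS_PER_SEC = 10**9    # hypothetical 1 Gbit/s link


def total_output_bytes(jobs, bytes_per_job):
    """Total bytes written by all jobs in one batch."""
    return jobs * bytes_per_job


def transfer_days(total_bytes, link_bits_per_sec):
    """Days needed to push total_bytes through one fully saturated link."""
    seconds = total_bytes * 8 / link_bits_per_sec
    return seconds / 86_400


total = total_output_bytes(JOBS_PER_BATCH, OUTPUT_BYTES_PER_JOB)
days = transfer_days(total, ASSUMED_LINK_BITS_PER_SEC)

print(f"Total output per batch: {total / 10**15:.0f} PB")
print(f"Days to move it over the assumed link: {days:.0f}")
```

Under these assumptions a full batch comes to about 10 PB, which would take roughly 926 days to move through a single saturated 1 Gbit/s link. That is why I want to avoid funnelling every output file through the submit machine.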

 

Does this sound like a feasible solution?

 

Is there a better solution, and if so, how would it be implemented (for example network architecture, ClassAds, etc.)?

 

How do other users in the Condor community deal with large data files and network traffic?

 

PL

 
