[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Condor-users] Network traffic and data storage recommendations [SEC=UNCLASSIFIED]



UNCLASSIFIED

Hi Steven, thanks for your reply. At the moment I have 5 machines in my
pool, but it is going to grow.

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Steven Timm
Sent: Thursday, 13 January 2011 3:35 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Network traffic and data storage
recommendations [SEC=UNCLASSIFIED]

On Thu, 13 Jan 2011, Lenou, Peter (Contractor) wrote:

> UNCLASSIFIED
>
> Hi,
>
>
>
> I am new to Condor and was wondering what schemes people in the Condor

> community use to manage the amount of data and network traffic their 
> jobs produce.
>
>
The key thing you haven't told us here is how big is your condor pool
and how many of these 10,000,000 jobs would be running simultaneously.
In general you would not want to use the default condor transfer file
mechanisms to transfer 10M 1Gb files.  If I remember correctly the new
development version of condor is going to allow a pluggable file
transfer mechanism so as to specify what mechanism you use to transfer
the files. i.e. gridftp, hadoop, http, etc.


>
> For example, my Condor requirements are that I submit a job specifying

> an input and output file via command line arguments i.e.
> MyExecutable.exe -in inputFile -out outputFile. In extreme cases, each

> batch may contain 10 million jobs, each creating an output file 1GB in

> size.
>
> I would want the output files to be transferred back to the submit 
> machine as the jobs complete in order to limit network traffic, and 
> the submit machines won't have a large amount of direct attached 
> storage so each execution machine in the pool would have direct 
> attached storage for the output files. I will then have a daemon to 
> bring all the output files together to a central location for
analysis.
>

It would have to be some daemon...that is a tall order.
>
>
> Does this sound like a feasible solution?
>
>
>
> Is there a better solution and how would this be implemented i.e.
> network architecture, ClassAds etc?
>
>

Since you say all your execution machines have direct attached storage
you might seriously consider using Hadoop and HDFS.
That's a cheap way to make commodity computers be able to share large
amounts of block data.. and if you go that step, you might also consider
MapReduce or some other product like it to do the bringing together and
concatenation of the files.  Every execute node would be serving some of
your data and thus you have a many-to-many network problem instead of a
many-to-one.

Also--I doubt very much if you could put 10 million jobs into a single
condor cluster.  DAGman is your friend here, and maybe some of the
higher-level technologies that use it such as Pegasus or Swift.

Steve Timm



>
> How do other users in the Condor community deal with large data files 
> and network traffic?
>
>
>
> PL
>
>
>
> IMPORTANT: This email remains the property of the Department of 
> Defence and is subject to the jurisdiction of section 70 of the Crimes
Act 1914.
> If you have received this email in error, you are requested to contact

> the sender and delete the email.
>
>
>

--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/ Fermilab Computing Division,
Scientific Computing Facilities, Grid Facilities Department, FermiGrid
Services Group, Group Leader.
Lead of FermiCloud project.
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with
a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/

IMPORTANT: This email remains the property of the Department of Defence
and is subject to the jurisdiction of section 70 of the Crimes Act 1914.
If you have received this email in error, you are requested to contact
the sender and delete the email.