Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] Request for Ideas/Plans: Designing a Large Condor Pool

Date: Thu, 25 May 2006 08:06:04 -0400
From: Jess Cannata <jac67@xxxxxxxxxxxxxx>
Subject: Re: [Condor-users] Request for Ideas/Plans: Designing a Large Condor Pool

Yes, we definitely do not want to rely on only one schedd, and we probably do not want to rely on one collector and negotiator, either. We also have the challenge of getting the output back to the user. We have a few ideas on how to do this, but we'd first like to hear from the groups that are already doing this; apparently other people on the list are interested, too.

When we were at CondorWeek last year we brought up the need for some sample deployment diagrams for large pools, and there seemed to be a lot of interest in this. Please send me simple diagrams (or explanations) of your pools, even if you don't think that your pool is all that interesting. You can send them as Visio, PowerPoint, or other formats (just tell me what they are), and I will compile them and see about getting them included either in the Condor manual or in some other section on the web site.

You can send the files as attachments directly to me and I will make sure the information is pushed back out to the list.


Jess Cannata
Advanced Research Computing
Georgetown University
jac67@xxxxxxxxxxxxxx


Matt Hope wrote:

On 5/24/06, Jess Cannata <jac67@xxxxxxxxxxxxxx> wrote:

Dear Condor User Community:

We are in the process of setting up a Condor pool to initially include
all lab machines (400-1000 machines) on campus, though later we plan to
add a few of our clusters. While we currently run Condor on some of our
smaller clusters, we suspect that the layout for this larger pool will
be different than a standard Condor pool.

For this campus pool, we want one entry point for users to submit jobs.
Since the pool will have tens of thousands of jobs in queue, with
several hundreds running simultaneously, we know that we will likely
overload one schedd along with the other daemons.

Does anyone have any design plans that outline how one might set up a
pool with a single point of entry, with multiple daemons to spread out
the load and provide some redundancy? I've looked in the manual for
examples of large deployments, but cannot find any. Am I missing
something? If you wouldn't mind sharing your pool layout, I think that
this would be useful to many Condor users especially if your pool is not
a typical pool.


For a large pool, having all the schedds on one machine is a very bad
idea, since that machine dying (or needing servicing) will screw the
whole farm.

Having your controlled submit point automatically trigger submissions
to some distribution of schedds is the best idea. This way you have
to wrap the submission tools (or use the SOAP library), but this may be
a benefit since you also gain total control over what submit options
are set.
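
As a rough illustration of the wrapper idea above, here is a minimal Python sketch that spreads submissions round-robin over several schedds and forces a few submit options. The schedd names, the forced options, and the helper names are hypothetical. It assumes condor_submit's -remote and -append options behave as described in the manual for your Condor version, so check that before relying on it.

```python
import itertools
import subprocess

# Hypothetical schedd names; replace with the schedds in your own pool.
SCHEDDS = ["schedd1.example.edu", "schedd2.example.edu", "schedd3.example.edu"]


def make_picker(schedds):
    """Return a function that hands out schedd names in round-robin order."""
    cycle = itertools.cycle(schedds)
    return lambda: next(cycle)


def build_submit_command(schedd, submit_file, forced_options=None):
    """Build the condor_submit command line for one job.

    forced_options is a dict of submit commands the entry point wants to
    control; each is passed with -append so it is read after the commands
    already in the user's submit file (assumed behaviour, see the manual).
    """
    cmd = ["condor_submit", "-remote", schedd]
    for key, value in sorted((forced_options or {}).items()):
        cmd += ["-append", f"{key} = {value}"]
    cmd.append(submit_file)
    return cmd


def submit(pick, submit_file, forced_options=None):
    """Pick the next schedd and run condor_submit against it."""
    cmd = build_submit_command(pick(), submit_file, forced_options)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    pick = make_picker(SCHEDDS)
    # Print the commands instead of running them, as a dry run.
    for _ in range(4):
        print(" ".join(build_submit_command(pick(), "job.sub",
                                            {"notification": "Never"})))
```

Round-robin is only the simplest distribution. A real entry point would probably also skip schedds that are down or already heavily loaded.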

Dealing with getting data back is the tricky bit in this case. If you
have some form of network file system, you can just get the jobs to
direct their output there, but if not, you will need some means of
bringing back the resulting files from the disk on the schedd's
machine.
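
The two cases for getting output back can be sketched in Python as follows. This is only an assumption-laden illustration: the directory layout, the schedd name, and the helper names are made up, and the second helper assumes the jobs were submitted with spooled output (for example via condor_submit -remote) so that condor_transfer_data can fetch the files from the schedd.

```python
import posixpath


def shared_fs_output_lines(shared_root, user, job_tag):
    """Submit-file lines that send job output to a network file system.

    shared_root, user and job_tag are hypothetical; the directory must be
    visible from wherever the files end up being written.
    """
    base = posixpath.join(shared_root, user, job_tag)
    return [
        f"output = {base}/job.$(Cluster).$(Process).out",
        f"error = {base}/job.$(Cluster).$(Process).err",
        f"log = {base}/job.$(Cluster).log",
    ]


def build_fetch_command(schedd, cluster_id, proc_id=None):
    """Command line that pulls spooled output back from a remote schedd."""
    job = str(cluster_id) if proc_id is None else f"{cluster_id}.{proc_id}"
    return ["condor_transfer_data", "-name", schedd, job]


if __name__ == "__main__":
    print("\n".join(shared_fs_output_lines("/net/condor", "alice", "run42")))
    print(" ".join(build_fetch_command("schedd1.example.edu", 1234, 0)))
```

With no shared file system, the entry point would have to remember which schedd each job went to so that it can run the fetch command against the right one once the job completes.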

This is only rough but gives you a flavour for what you can do.

Matt

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
