
RE: [Condor-users] Resending: Condor Configuration Management RFC


I am very pleased to see this exchange of ideas, requirements and opinions on the Condor-users list. I believe that all of us - users and developers - will benefit from an open discussion of the value of existing and/or planned capabilities of Condor or related software tools.

Thank you for being part of the Condor community and for helping us to enhance its capabilities and effectiveness.


 At 07:55 PM 9/16/2005, you wrote:

Thank you so much for your very thoughtful response. My comments are included below:
> These are interesting ideas.  I would make the following comments:
> - Using a RDBMS is (probably) overkill unless you've got a really huge
> set of hosts.  Database systems really come into their own when you need
> to be able to make a (large) number of changes to a datastore whilst
> maintaining transactional consistency.  Certainly in the 400-500 node
> pool that I maintain, updating flat files and running `condor_reconfig
> -all` is sufficient.
I definitely agree that a relational database is overkill when your configuration files are infrequently updated and well-tuned to your needs.
Given the very flexible and highly configurable nature of the Condor system, there are many wonderful things one can do with Condor by dynamically adding new resource attributes or dynamically changing Condor's policy expressions. In large companies, machines are typically shared across different groups, and each group owns its machines and hence has its own set of policies and settings.  Sorting out and remembering which local config files contain which policies can lead to management headaches.  A central database can help with that.
A database can open up some possibilities that you may not have considered before.  Let's say that a central database makes it easy to change Condor's policy expressions (START, PREEMPT, RANK, etc.) for arbitrary groups of machines.  Now, let's also say that your boss walks in and wants 50 machines for his exclusive use RIGHT NOW.  Problem solved: it's easy to just change the START expression for 50 machines in your central database.
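As a rough sketch of the scenario above, the following uses Python's built-in sqlite3 module to stand in for the central database. The table name machine_policy, its columns, the host names, and the helper functions are all hypothetical, and the START expression is only an illustrative value; in practice the rendered settings would still have to be delivered to each machine and picked up with something like `condor_reconfig -all`.

```python
import sqlite3

# Hypothetical schema: one row per machine with its group and START expression.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE machine_policy ("
    "hostname TEXT PRIMARY KEY, grp TEXT, start_expr TEXT)"
)
conn.executemany(
    "INSERT INTO machine_policy VALUES (?, ?, ?)",
    [(f"node{i:03d}", "shared", "TRUE") for i in range(200)],
)


def reserve_machines(conn, count, owner):
    """Restrict the START expression of `count` shared machines to one owner."""
    expr = f'(Owner == "{owner}")'  # illustrative policy expression
    with conn:  # single transaction: every selected machine changes, or none
        rows = conn.execute(
            "SELECT hostname FROM machine_policy "
            "WHERE grp = 'shared' ORDER BY hostname LIMIT ?",
            (count,),
        ).fetchall()
        conn.executemany(
            "UPDATE machine_policy SET grp = 'reserved', start_expr = ? "
            "WHERE hostname = ?",
            [(expr, hostname) for (hostname,) in rows],
        )
    return [hostname for (hostname,) in rows]


def render_local_config(conn, hostname):
    """Produce the config line a machine would receive from the database."""
    (start_expr,) = conn.execute(
        "SELECT start_expr FROM machine_policy WHERE hostname = ?",
        (hostname,),
    ).fetchone()
    return f"START = {start_expr}\n"


reserved = reserve_machines(conn, 50, "boss")
print(len(reserved), render_local_config(conn, reserved[0]), end="")
```

Wrapping the change in one transaction is the point of using a database here: either all 50 machines switch policy or none do.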
To take this a step further, what if Condor's policy expressions could change _automatically_ in response to some event (or events)?  To give an example, you could set up a "rule" to change the policies of a pool when it is highly loaded.  Another "rule" could exist to change pool policies when certain throughput requirements aren't being met by an important group in your company.  This is just the tip of the iceberg.
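One way such rules might look, purely as an illustration: each rule pairs a condition on the observed pool state with the policy overrides to apply when the condition holds. The Rule class, the pool-state keys, the thresholds, and the override values below are all assumptions made for this sketch, not part of Condor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: if `condition` holds, apply `policy` overrides."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    policy: Dict[str, str]


def evaluate_rules(rules: List[Rule], pool_state: Dict[str, float]) -> Dict[str, str]:
    """Collect overrides from every rule whose condition matches the pool state."""
    overrides: Dict[str, str] = {}
    for rule in rules:
        if rule.condition(pool_state):
            overrides.update(rule.policy)  # later rules win on conflicts
    return overrides


rules = [
    # Pool is highly loaded: allow preemption (illustrative value).
    Rule("high-load", lambda s: s["busy_fraction"] > 0.9, {"PREEMPT": "TRUE"}),
    # An important group is below its throughput target: prefer its jobs.
    Rule(
        "throughput-shortfall",
        lambda s: s["group_jobs_per_hour"] < s["group_target_per_hour"],
        {"RANK": '(Owner == "important_user")'},
    ),
]

loaded_pool = {
    "busy_fraction": 0.95,
    "group_jobs_per_hour": 120.0,
    "group_target_per_hour": 100.0,
}
overrides = evaluate_rules(rules, loaded_pool)
print(overrides)
```

The resulting overrides would then be written back to the central database for the affected machines, so the rules engine never has to touch local config files directly.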
To respond like this, however, we need to capture more information in the database. It needs to contain essentially the state of the entire pool: all the machine ads, all the job ads, historical job performance, information on running daemons, and more.  Handling all this data demands a powerful database.  But once all this information is available and centrally located, it becomes possible to analyze, visualize, and even troubleshoot Condor.
I have spent quite a bit of time and energy envisioning and then developing a way to automate and manage Condor.  The concepts above are central to ongoing work at Optena. I welcome your further discussion and exchange of ideas.
Surendra Reddy
Founder & CTO
Direct : +1.408.321.9006
Fax    : +1.408.904.5992
Mobile: +1.408.203.0077
Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
This electronic transmission (and any attached documents) contains information from Optena Corporation and is for the sole use of the individual or entity it is addressed to.  If you receive this message in error, please notify me and destroy the attached message (and all attached documents) immediately.