
RE: [Condor-users] Resending: Condor Configuration Management RFC



I wrote:

> >We at Optena would like to start a discussion around Condor
> >configuration management.

Alain wrote:

> Have you looked at existing tools like Gridconfig?
> 
> http://rocks.npaci.edu/gridconfig/

I hadn't heard of that; thanks for the link.  Based on a cursory glance,
one issue may be that it's very Unix-centric.  It's written in Python,
and (although I can't find this in the document) I'll bet it uses Unix
tools to push out the configuration information.  Condor has lots of
Windows users.

That aside, however, I do like the fact that it's generic and extensible
enough to manage many different applications.  This is something we
should seriously look into.

David McBride writes:

> Hi,
> 
> I've reviewed the proposal.

Thank you for taking the time to have a look and post a thoughtful
reply.  

I agree with several of your comments, but I also think that we share a
pro-Unix bias.  Your comments are also predicated on the existence of a
shared file system.  The default behavior on Windows is that there ISN'T
a shared file system, although this can be dealt with.  As well,
machines located in physically different areas often don't share a file
system.

More comments are inline below.

> At its heart, it appears to suggest the introduction of the following
> features:
> 
> - Using a RDBMS to store live configuration data.
> - Exporting the RDBMS contents from a new Configuration server.
> - Graphical and commandline tools for updating the configuration
> server.
> - Arrange the configuration data to allow hierarchical configurations,
> eg:
> 	Common:		Site -> Pool -> Host
> 	Unix-specific:	Site -> Pool -> Host
> 	...
> 
> These are interesting ideas.  I would make the following comments:
> 
> - Using a RDBMS is (probably) overkill unless you've got a really huge
> set of hosts.  Database systems really come into their own when you
> need to be able to make a (large) number of changes to a datastore whilst
> maintaining transactional consistency.  Certainly in the 400-500 node
> pool that I maintain, updating flat files and running `condor_reconfig
> -all` is sufficient.

Fair enough, but I should also point out that the database can be used
for more than configuration information when managing a Condor pool.
Once those other uses are counted, it looks less like overkill.

> - The introduction of a RDBMS introduces a new question: change
> management.  With flat files, existing revision control systems --
> RCS, CVS, SVN, Arch, or whatever is locally appropriate -- can be used to
> record changes and manage rollbacks of configurations as required.
> Whilst a RDBMS can certainly *provide* atomic snapshots at moments in
> time, this functionality is usually not automatic and some additional
> facilities would be needed to implement good revision control.

Agreed.
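
One way to get revision history out of a database is to make the
configuration table append-only, so that every change is a new revision
and a rollback is just a read at an older revision number.  Here is a
minimal sketch, assuming SQLite; the table name, columns, and scope
strings are all made up for illustration and are not taken from the
proposal:

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical append-only configuration store."""
    db = sqlite3.connect(path)
    db.execute(
        """
        CREATE TABLE IF NOT EXISTS config_rev (
            rev    INTEGER PRIMARY KEY AUTOINCREMENT,
            scope  TEXT NOT NULL,   -- e.g. 'site', 'pool:a', 'host:n01'
            name   TEXT NOT NULL,
            value  TEXT,            -- NULL records a deletion
            author TEXT NOT NULL
        )
        """
    )
    return db


def set_value(db, scope, name, value, author):
    """Record a change as a new row; existing rows are never updated."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO config_rev (scope, name, value, author) "
            "VALUES (?, ?, ?, ?)",
            (scope, name, value, author),
        )
    return cur.lastrowid


def snapshot(db, scope, as_of_rev=None):
    """Return {name: value} for a scope as of a revision (default: latest)."""
    if as_of_rev is None:
        as_of_rev = db.execute(
            "SELECT COALESCE(MAX(rev), 0) FROM config_rev"
        ).fetchone()[0]
    rows = db.execute(
        "SELECT name, value FROM config_rev "
        "WHERE scope = ? AND rev <= ? ORDER BY rev",
        (scope, as_of_rev),
    )
    result = {}
    for name, value in rows:
        if value is None:
            result.pop(name, None)  # a NULL value removes the setting
        else:
            result[name] = value    # later revisions override earlier ones
    return result
```

Calling snapshot() with the revision number returned by an earlier
set_value() gives the configuration as it stood at that point, which is
the rollback case.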

> - A graphical utility to aid someone unfamiliar with Condor with
> system configuration could be useful to those who are unfamiliar with
> Condor.
> Typically, I find the comments in the standard template configuration
> file (and where I'm confused, the manual) to be sufficient.

To each his/her own. :-)

> - The hierarchical configuration concept is a vital one for even
> moderate-sized pools.  I already use a slightly less general
> hierarchical system than the one you describe using the existing
> Condor facilities:

This is good stuff.

> To conclude, my preference would be to keep the existing Condor
> flat-files mechanisms for the live pool configuration.
> 
> If I had a very large number of nodes to manage, I would store a
> hierarchy of configuration options in some form of template or
> database, and then using glue scripts to *generate* from that the Condor
> configuration files.

This is essentially what we do.  We generate a file that Condor reads;
as in your solution, that file needn't be read or written by anything
other than the tool that generates it and the Condor daemons.
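
To make the generation step concrete, here is a rough sketch of merging
Site -> Pool -> Host layers and writing out the flat "NAME = value" text
that Condor reads.  It is an illustration only, not our actual tool: the
function names are made up, and the hostnames and values are
placeholders.

```python
def merge_layers(*layers):
    """Merge settings from least to most specific; later layers win."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


def render_config(settings, header="# Generated file; do not edit by hand."):
    """Render settings as 'NAME = value' lines in a stable, sorted order."""
    lines = [header]
    for name in sorted(settings):
        lines.append(f"{name} = {settings[name]}")
    return "\n".join(lines) + "\n"


# Placeholder data: each layer overrides the one before it.
site = {"CONDOR_HOST": "cm.example.org", "START": "TRUE"}
pool = {"START": "KeyboardIdle > 900"}
host = {"NUM_CPUS": "2"}

if __name__ == "__main__":
    print(render_config(merge_layers(site, pool, host)), end="")
```

Sorting the names keeps the generated file stable between runs, so a
diff of two generations shows only the settings that really changed.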

> But this is a "push" mechanism rather than a "pull", which in practice
> has been more reliable in other systems where we have adopted this
> approach.  For example, our live DNS, DHCP, SMTP and other service
> configurations are all generated from databases in this manner.

On the very first startup, there is a pull from the server.  After
that, we can use both push and pull, and in any case we always have
the previous copy of the configuration to fall back on.
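
The fallback behavior can be sketched roughly like this.  It is a
simplified illustration, not our code: refresh_config and its arguments
are hypothetical, and "fetch" stands in for whatever actually talks to
the configuration server.

```python
import os
import tempfile


def refresh_config(fetch, cache_path):
    """Pull fresh configuration text; fall back to the cached copy.

    fetch is a callable that returns the configuration as a string, or
    raises OSError (which includes ConnectionError) if the server cannot
    be reached.  On the very first startup there is no cached copy yet,
    so a failed fetch surfaces as FileNotFoundError.
    """
    try:
        text = fetch()
    except OSError:
        # Server unreachable: reuse the last copy we stored successfully.
        with open(cache_path, encoding="utf-8") as f:
            return f.read()

    # Write to a temporary file in the same directory, then swap it into
    # place, so a crash mid-write never damages the previous good copy.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(text)
    os.replace(tmp_path, cache_path)
    return text
```

A push from the server can go through the same write path; only the
trigger differs.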

> Hope this helps.

It does, and thanks again for your reply.

Mike Yoder
Principal Member of Technical Staff
Ask Mike: http://docs.optena.com
Direct  : +1.408.321.9000
Fax     : +1.408.321.9030
Mobile  : +1.408.497.7597
yoderm@xxxxxxxxxx

Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
http://www.optena.com