
Re: [condor-users] Can I replace the Condor server in a live pool ?



On Wed, Oct 08, 2003 at 08:17:25AM -0500, Alain Roy wrote:
> 
> Yes, there will be no problems, you just won't start up new jobs while 
> there isn't a central manager.
> 
> >Just by replacing the old server with the new will the nodes - both 
> >submitting and executing - cope, or do I need to reconfig them?
> 
> As long as the name/ip hasn't changed, you won't have a problem. Otherwise, 
> you'll need to edit your configuration file(s) to point at the new server 
> and do a condor_reconfig on the nodes.
> 
> You will probably notice that it will take about five minutes for 
> "condor_status" to regain all of the knowledge it used to have about the 
> pool, but it will happen automatically.
> 
> >Will running jobs, queued jobs, etc continue as normal, or recover when 
> >the matchmaking starts again
> 
> Running jobs will continue running. Queued jobs will remain queued. If jobs 
> are interrupted during the outage, they will be queued then restarted when 
> the central manager returns.
> 
> >Or do I need to copy some server data files across to ensure continued 
> >service?
> 
> There is no need to do anything like that.
> 

Actually - there are some files that you may want to copy over. You probably
want to save the Accountant.log and Accountantnew.log files from the $(SPOOL)
directory on your central manager.
These are the files where the priorities of all of the currently running
users are stored. There's nothing specific in those files about what central
manager they came from, so the easiest way to migrate is to shut down the
old CM, copy those files to the new CM, and then turn on the new CM.
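
For what it's worth, the copy step could be scripted along these lines. This
is only a rough sketch, not an official Condor tool: the function name is made
up for illustration, and it assumes both spool directories are reachable from
the machine running it (for example over a shared mount) and that the old CM
has already been shut down. Running "condor_config_val SPOOL" on each machine
should tell you where its spool directory is.

```python
import shutil
from pathlib import Path

# File names as given in the message above.
ACCOUNTANT_FILES = ("Accountant.log", "Accountantnew.log")


def copy_accountant_logs(old_spool, new_spool):
    """Copy the accountant files from old_spool to new_spool.

    Hypothetical helper for illustration. Returns the names of the files
    that were actually copied; files missing from old_spool are skipped.
    """
    old_spool, new_spool = Path(old_spool), Path(new_spool)
    new_spool.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in ACCOUNTANT_FILES:
        src = old_spool / name
        if src.is_file():
            # copy2 also preserves timestamps and permission bits
            shutil.copy2(src, new_spool / name)
            copied.append(name)
    return copied
```

You would call it with the two spool paths, e.g.
copy_accountant_logs("/mnt/old-cm/spool", "/var/lib/condor/spool"), where both
paths are placeholders for whatever your own configuration reports, and then
start the new CM.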

If you're not worried about user priorities, you don't need to copy them 
to the new machine. If you don't copy them, there may be a bit of preemption
in your pool for the first few rounds of matchmaking as the negotiator
tries to figure out what's going on.

-Erik

> -alain
> 
> 
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
> 