
Re: [HTCondor-users] Reproduce a new history file



Hi Todd,

Thanks for your suggestions! I'll try both of them.

Some updates:
About ten minutes after I touched a new history file, completed jobs automatically started being recorded in the history file again.
I'm wondering why the new jobs were not recorded right away, even though many jobs were completing during that time.
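
To put a number on the delay, something like the following could be used. This is only a minimal sketch: the function name wait_for_growth and its default timeout and interval are made up for illustration and are not part of HTCondor. It just polls the file size until it changes.

```shell
#!/bin/sh
# Poll a file until its size changes, or give up after a timeout.
# Usage: wait_for_growth FILE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 if the size changed, 1 on timeout, 2 if FILE is missing.
# (Hypothetical helper; the defaults below are arbitrary.)
wait_for_growth() {
    file=$1
    timeout=${2:-900}
    interval=${3:-10}
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 2; }
    # Arithmetic expansion strips any padding that wc may print.
    start_size=$(( $(wc -c < "$file") ))
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        sleep "$interval"
        waited=$((waited + interval))
        [ -f "$file" ] || { echo "file disappeared: $file" >&2; return 2; }
        size=$(( $(wc -c < "$file") ))
        if [ "$size" -ne "$start_size" ]; then
            echo "size changed after about ${waited}s"
            return 0
        fi
    done
    echo "no change within ${timeout}s"
    return 1
}
```

For example, `wait_for_growth /var/lib/condor/spool/history 900 10` run right after the touch would report roughly how long it takes before the schedd writes to the file again.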

Also, I'd like to compare "restarting the condor service on the schedd machine" with "killing the schedd process". In the latter case the schedd process will be restarted by the master process. Does that mean all the shadow processes will reconnect to the new schedd process, and is the recovery procedure similar to "restarting the condor service"? (About a year ago we lost some shadow processes when restarting the condor service; that was fixed with your help.)

Thanks!
Cheers,
Xiaowei

> -----Original Message-----
> From: "Todd Tannenbaum" <tannenba@xxxxxxxxxxx>
> Sent: 2019-05-23 23:04:08 (Thursday)
> To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>, "JIANG Xiaowei" <jiangxw@xxxxxxxxxx>
> Cc: 
> Subject: Re: [HTCondor-users] Reproduce a new history file
> 
> On 5/22/2019 10:31 PM, JIANG Xiaowei wrote:
> > Hi All,
> > 
> > 
> > I ran into a small problem:
> > 
> > The history file (/var/lib/condor/spool/history) was removed by accident.
> > 
> > I created a new history file with the "touch" command and changed the owner 
> > to condor:condor, but no new job history was written to it.
> > 
> > Other than restarting the schedd, is there a better way to recreate the 
> > history file and get job history written to it again?
> > 
> 
> Hi Xiaowei,
> 
> I think just doing a condor_reconfig on the schedd will be sufficient. 
> If you are using the default configuration, you should be able to login 
> to your central manager and do
>    condor_reconfig <name of machine running the schedd>
> 
> Alternatively, if you have root on the machine running the schedd, you 
> can send it a HUP signal to achieve the same thing.  E.g. as root do
> 
>     kill -HUP <pid_of_your_schedd>
> 
> hope it helps,
> Todd
> 
>
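
For reference, the HUP approach suggested above could be wrapped up roughly like this. It is a minimal sketch, assuming that pgrep is available and that the daemon's process name is condor_schedd (an assumption; check with ps on your machine). The helper name hup_by_name is hypothetical, and it would need to be run as root.

```shell
#!/bin/sh
# Send SIGHUP to every process whose name matches NAME exactly.
# NAME defaults to condor_schedd (assumed process name).
# Returns 1 if no matching process is found or a signal cannot be sent.
hup_by_name() {
    name=${1:-condor_schedd}
    pids=$(pgrep -x "$name") || {
        echo "no process named $name" >&2
        return 1
    }
    for pid in $pids; do
        kill -HUP "$pid" || return 1
        echo "sent HUP to $name (pid $pid)"
    done
}
```

Calling `hup_by_name` with no argument targets the schedd; unlike a plain kill, HUP only asks the daemon to reconfigure, so the process keeps running.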