
Re: [HTCondor-users] Ever expanding Accountantnew.log



Aaaargh! How did I miss that?!

 

Thanks Greg, simple fix. At least one Greg knows what they’re doing! ;)

 

The spool directory was owned by root:root, so I changed it to condor:root, and by the time I looked, Accountantnew.log had already been rotated.

 

04/06/17 13:40:07 About to rotate ClassAd log /home/condor/spool/Accountantnew.log

04/06/17 13:40:10 Accountant::UpdatePriorities - truncating database (prev size=16378964627)

04/06/17 13:40:10 Database has grown, expanding MAX_ACCOUNTANT_DATABASE_SIZE to 147750

 

# ll

total 1588

-rw------- 1 condor condor 120779 Apr  6 13:42 Accountantnew.log

-rw-r--r-- 1 condor condor 744905 Nov  3 18:20 history

-rw------- 1 condor condor 668019 Nov  3 18:20 job_queue.log

-rw------- 1 condor condor  53365 Nov  3 18:20 job_queue.log.1

-rw------- 1 condor condor    158 Nov  3 18:20 job_queue.log.4

drwxrwxrwt 2 condor condor   4096 Nov  3 18:20 local_univ_execute

-rw-r--r-- 1 condor condor     59 Nov  3 18:20 spool_version

 

# df -k

Filesystem     1K-blocks     Used Available Use% Mounted on

/dev/sda3       41274688   181800  38996248   1% /home/condor/spool

 

So Accountantnew.log has shrunk from 16 GB to 120 KB!

 

Thanks again.

 

Cheers

 

Greg

 

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg Thain
Sent: Thursday, 6 April 2017 5:18 AM
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] Ever expanding Accountantnew.log

 

On 04/05/2017 12:13 AM, Greg.Hitchen@xxxxxxxx wrote:

 

 

This is the relevant entry from NegotiatorLog (with NEGOTIATOR_DEBUG = D_MATCH D_ACCOUNTANT).

 

04/05/17 13:20:15 About to rotate ClassAd log /home/condor/spool/Accountantnew.log

04/05/17 13:20:15 failed to rotate log: safe_open_wrapper(/home/condor/spool/Accountantnew.log.tmp) returns -1


Greg:

When the condor_negotiator starts up, it compresses the accounting log by reading this transaction log file (which contains only diffs) into memory, then writes out a new file with one record per user. It does this in the traditional, safe, Unix way: reading the old file, appending to a temp file named Accountantnew.log.tmp, and atomically renaming the .tmp to the working version. The condor_negotiator should be running with the euid of condor during this processing.
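To illustrate the write-to-temp-then-rename pattern described above, here is a minimal Python sketch. It is not HTCondor's actual code: the "key value" line format and the function name compact_log are assumptions made up for this example. It does show why the step needs write permission on the directory itself: the .tmp file is created next to the original.

```python
import os

def compact_log(path):
    """Collapse a diff-style log to one record per key, then swap it in atomically.

    Hypothetical format: each line is "<key> <value>", and later lines
    override earlier ones. This is NOT the real Accountantnew.log format.
    """
    latest = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            key, _, value = line.partition(" ")
            latest[key] = value  # a later diff replaces the earlier value

    # Creating the temp file requires write permission on the directory.
    # This is the step that fails when the directory is not writable.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as tmp:
        for key, value in latest.items():
            tmp.write(f"{key} {value}\n")
        tmp.flush()
        os.fsync(tmp.fileno())  # push the data to disk before the rename

    # Atomic on POSIX: readers see either the old file or the new one.
    os.replace(tmp_path, path)
    return len(latest)
```

If the process cannot create the .tmp file, the rename never happens and the old log keeps growing, which matches the symptom in this thread.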

On your machine, condor is failing to create the tmp file.  Perhaps the spool directory doesn't have write permission for the condor user or group?  If you su to condor on that machine, can you create a zero-length file in the spool directory with some random name like foo?
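The check suggested above can also be scripted. The following is a rough sketch, with can_create_file as a hypothetical helper name. It only tells you something useful when run as the condor user (for example after su to condor), since it tests the permissions of whoever runs it.

```python
import os
import tempfile

def can_create_file(directory):
    """Try to create and remove a scratch file in `directory`.

    Returns (True, "") on success, or (False, reason) if creation fails,
    e.g. because the current user lacks write permission on the directory.
    """
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError as err:
        return False, str(err)
    os.close(fd)
    os.unlink(path)  # clean up the scratch file
    return True, ""
```

For the machine in this thread, the call would be can_create_file("/home/condor/spool"). A False result with a "Permission denied" reason points at the directory ownership or mode.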

-greg