Re: [HTCondor-users] RemoteWallClockTime vs CommittedTime


> We have noticed a problem in collecting accounting data from the HTCondor
> ClassAds.  We are seeing situations where CPU time exceeds wall time.
> We use the RemoteWallClockTime attribute as the basis for wall time.  According
> to the documentation, this appears to be the correct one to use.  The accounting
> system also captures CommittedTime.  We are seeing cases where
> CommittedTime exceeds RemoteWallClockTime.  One of many cases:
>  CommittedTime = 3944     RemoteWallClockTime = 1   Total CPU = 1935
> Based on the documentation, if I am interpreting it correctly, CommittedTime
> should never exceed RemoteWallClockTime, since CommittedTime can be reset to
> zero if the job is evicted without a checkpoint, whereas RemoteWallClockTime
> is not reset.
> I am trying to understand under what conditions this can occur.
> It makes no sense to us.

Is this happening while the jobs are actively running?  The
RemoteWallClockTime returned by condor_q is only accurate when the
job is not running.

I have jobs running now with multiple hours of CommittedTime but with
RemoteWallClockTime still at zero.  Once a job is evicted, its
RemoteWallClockTime is updated.
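
For the accounting side, here is a minimal Python sketch of how a collector
might guard against this.  It works on a plain dict of job attributes, so it
does not depend on any particular HTCondor API.  The function names are
hypothetical.  JobStatus == 2 is HTCondor's code for a running job.  Using
JobCurrentStartDate to estimate the elapsed time of the current run is my
assumption, not something confirmed in this thread, and the start date in the
example is made up.

```python
RUNNING = 2  # JobStatus value HTCondor uses for a running job


def wall_time_is_settled(ad):
    """True if RemoteWallClockTime can be taken at face value.

    Per the reply above, the value is only accurate when the job
    is not running.
    """
    return ad.get("JobStatus") != RUNNING


def estimated_wall_time(ad, now):
    """Best-effort wall time in seconds for one job record.

    For a running job, add the time elapsed in the current run to the
    stored value.  ASSUMPTION: JobCurrentStartDate marks the start of
    the current run; verify this against your HTCondor version.
    """
    wall = ad.get("RemoteWallClockTime", 0)
    if wall_time_is_settled(ad):
        return wall
    start = ad.get("JobCurrentStartDate")
    if start is None:
        return wall  # nothing better available
    return wall + max(0, now - start)


# The record from the original question, with a made-up start date.
ad = {
    "JobStatus": RUNNING,
    "CommittedTime": 3944,
    "RemoteWallClockTime": 1,
    "JobCurrentStartDate": 1000,
}
settled = wall_time_is_settled(ad)
estimate = estimated_wall_time(ad, now=1000 + 3944)
```

A simpler alternative is to skip any record for which wall_time_is_settled()
returns False and only account for jobs once they have left the running state.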