
Re: [HTCondor-users] Is it a good idea to use condor C++ log class to monitor jobs?



On 4/3/2013 5:07 AM, 钱晓明 wrote:
I have to build a wrapper layer to monitor condor jobs. I want the
status of jobs to be reported to a remote GUI program as soon as possible.

So I want to build a daemon that uses the condor C++ log reader class to
read job events periodically on the submit nodes and send them to the
monitor program.

I want to know if this is a good idea and, if not, what I should do instead.



Sounds like a perfectly reasonable approach to me. This is what the event log and the log reader class are intended for. While it is not hard to roll your own code to read the event log, one advantage of the C++ reader class is that it deals with things like log rotations, missing events, etc.

Also be aware that you can create an event log not only per job (by adding "log = filename" to your condor_submit file); you can also configure the schedd to write a single (potentially rotating) event log that contains events across all jobs from all users on that schedd, by putting "EVENT_LOG = filename" in your condor_config file. This is probably what you want for the above application.

Some additional details below.

hope this helps,
Todd

++ Documentation on the C++ reader class:

http://research.cs.wisc.edu/htcondor/manual/v7.9/4_5Application_Program.html#SECTION00553000000000000000

++ Documentation on the various condor_config knobs related to the event log (cut-n-pasted from the manual):

The following macros control where and what is written to the event log, a file that receives job user log events, but across all users and all of their jobs.

EVENT_LOG
The full path and file name of the event log. There is no default value for this variable, so if it is not defined, no event log is written.

EVENT_LOG_MAX_SIZE
Controls the maximum length in bytes to which the event log will be allowed to grow. The log file will grow to the specified length, then be saved to a file with the suffix .old. The .old files are overwritten each time the log is saved. A value of 0 specifies that the file may grow without bounds (and disables rotation). The default is 1 Mbyte. For backwards compatibility, MAX_EVENT_LOG will be used if EVENT_LOG_MAX_SIZE is not defined. If EVENT_LOG is not defined, this parameter has no effect.

MAX_EVENT_LOG
See EVENT_LOG_MAX_SIZE.

EVENT_LOG_MAX_ROTATIONS
Controls the maximum number of rotations of the event log that will be stored. If this value is 1 (the default), the event log will be rotated to a ".old" file as described above. However, if this is greater than 1, then multiple rotation files will be stored, up to EVENT_LOG_MAX_ROTATIONS of them. Instead of the ".old" suffix, these files will be named with the suffixes ".1", ".2", and so on, with ".1" being the most recent rotation. This is an integer parameter with a default value of 1. If EVENT_LOG is not defined, or if EVENT_LOG_MAX_SIZE has a value of 0 (which disables event log rotation), this parameter has no effect.

EVENT_LOG_ROTATION_LOCK
Controls the lock file that will be used to ensure that, when rotating files, the rotation is done by a single process. This is a string parameter; its default value is the file path of the event log itself, with ".lock" appended. If EVENT_LOG is not defined, or if EVENT_LOG_MAX_SIZE has a value of 0 (which disables event log rotation), this parameter has no effect.

EVENT_LOG_FSYNC
A boolean value that controls whether HTCondor will perform an fsync() after writing each event to the event log. When True, an fsync() operation is performed after each event. This fsync() operation forces the operating system to synchronize the updates to the event log to the disk, but can negatively affect the performance of the system. Defaults to False.

EVENT_LOG_LOCKING
A boolean value that defaults to True. When True, the event log (as specified by EVENT_LOG) will be locked before being written to. When False, HTCondor does not lock the file before writing.

EVENT_LOG_USE_XML
A boolean value that defaults to False. When True, events are logged in XML format. If EVENT_LOG is not defined, this parameter has no effect.

EVENT_LOG_JOB_AD_INFORMATION_ATTRS
A comma-separated list of job ClassAd attributes, whose evaluated values form a new event, the JobAdInformationEvent, given Event Number 028. This new event is placed in the event log in addition to each logged event. If EVENT_LOG is not defined, this configuration variable has no effect. This configuration variable is the same as the job ClassAd attribute JobAdInformationAttrs, but it applies to the system Event Log rather than the user job log.
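
As a rough illustration of how these knobs fit together, a condor_config fragment for the monitoring use case might look like the following. The path, the size, the rotation count, and the attribute list are all made-up example values, not recommendations.

```
# Example values only; adjust the path and limits for your site.
# Write one event log for all jobs from all users on this schedd.
EVENT_LOG = /var/log/condor/EventLog

# Rotate once the log reaches roughly 10 MB.
EVENT_LOG_MAX_SIZE = 10000000

# Keep up to four rotated files: EventLog.1 (most recent) through EventLog.4.
EVENT_LOG_MAX_ROTATIONS = 4

# Add a JobAdInformationEvent (event 028) carrying these job ClassAd attributes.
EVENT_LOG_JOB_AD_INFORMATION_ATTRS = Owner, JobStatus
```

A change like this will most likely need a reconfigure or restart of the schedd before it takes effect.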