
Re: [HTCondor-users] logstash grok for XML Event Log



Hi again,

with significant help from the Logstash forum [1], I have attached a
Logstash configuration that should be usable to parse Condor or CondorCE
event logs written as XML and ship them to Elasticsearch etc.

Cheers,
  Thomas

[1]
https://discuss.elastic.co/t/xml-converting-each-tag-attribute-as-the-actual-keys/249546/


On 17/09/2020 17.24, Thomas Hartmann wrote:
> Hi all,
> 
> I wonder whether somebody already has a Logstash grok that mutates the
> results into a nicer format?
> 
> I am writing our CondorCE's event log as XML [1] and have put a Logstash
> config onto it [2] that parses the individual events [3.a] reasonably well
> into JSON [3.b].
> 
> The thing is that I would like to mutate the <a n="foobar"> tags.
> AFAICS the actual key is always the tag's single n attribute, and the
> value is wrapped in one of the int/str/real type tags.
> So I am looking for the best way to make the attribute the key
> (instead of the tag 'a') and to carve the value out of the type tag.
> 
> Since I am not really an expert with grok, I am hoping that maybe
> somebody already has a mutate or similar at hand that I could borrow...? ;)
> 
> Cheers,
>   Thomas
> 
> [1]
> EVENT_LOG = /var/log/condor-ce/EventLog.xml
> EVENT_LOG_MAX_SIZE =  500000000
> EVENT_LOG_MAX_ROTATIONS = 4
> EVENT_LOG_USE_XML=True
> 
> ==============================================================
> 
> [2]
> input {
>   file {
>     path => "/var/log/condor-ce/EventLog.xml"
>     start_position => "beginning"
>     sincedb_path => "/var/log/condor-ce/.EventLog.sincedb"
>     exclude => "*.gz"
>     type => "xml"
>       codec => multiline {
>         pattern => "<c>"
>         negate => "true"
>         what => "previous"
>       }
>   }
> }
> 
> filter {
>     xml {
>         source => "message"
>         store_xml => true
>         target => "events"
>         xpath => [
>             "/stations/station/id/text()", "station_id",
>             "/stations/station/name/text()", "station_name"
>         ]
>     }
> }
> 
> ==============================================================
> 
> [3.a]
> <c>
>     <a n="SentBytes"><r>0.0</r></a>
>     <a n="TotalRemoteUsage"><s>Usr 0 00:00:33, Sys 0 00:00:16</s></a>
>     <a n="TotalLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>     <a n="EventTypeNumber"><i>5</i></a>
>     <a n="TotalSentBytes"><r>0.0</r></a>
>     <a n="Subproc"><i>0</i></a>
>     <a n="MyType"><s>JobTerminatedEvent</s></a>
>     <a n="RunRemoteUsage"><s>Usr 0 00:00:33, Sys 0 00:00:16</s></a>
>     <a n="EventTime"><s>2020-09-17T16:44:29.367</s></a>
>     <a n="Cluster"><i>64876</i></a>
>     <a n="Proc"><i>0</i></a>
>     <a n="ReceivedBytes"><r>0.0</r></a>
>     <a n="TerminatedNormally"><b v="t"/></a>
>     <a n="TotalReceivedBytes"><r>0.0</r></a>
>     <a n="ReturnValue"><i>0</i></a>
>     <a n="RunLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
> </c>
> 
> ==============================
> 
> [3.b]
>> grep TotalRemoteUsage /tmp/logstash.eventxml.json | head -n1 | jq .
> {
>   "host": "grid-htcondorce0.desy.de",
>   "events": {
>     "a": [
>       {
>         "n": "SentBytes",
>         "r": [
>           "0.0"
>         ]
>       },
>       {
>         "n": "TotalRemoteUsage",
>         "s": [
>           "Usr 0 00:00:33, Sys 0 00:00:16"
>         ]
>       },
>       {
>         "n": "TotalLocalUsage",
>         "s": [
>           "Usr 0 00:00:00, Sys 0 00:00:00"
>         ]
>       },
>       {
>         "n": "EventTypeNumber",
>         "i": [
>           "5"
>         ]
>       },
>       {
>         "n": "TotalSentBytes",
>         "r": [
>           "0.0"
>         ]
>       },
>       {
>         "n": "Subproc",
>         "i": [
>           "0"
>         ]
>       },
>       {
>         "n": "MyType",
>         "s": [
>           "JobTerminatedEvent"
>         ]
>       },
>       {
>         "n": "RunRemoteUsage",
>         "s": [
>           "Usr 0 00:00:33, Sys 0 00:00:16"
>         ]
>       },
>       {
>         "n": "EventTime",
>         "s": [
>           "2020-09-17T16:44:29.367"
>         ]
>       },
>       {
>         "n": "Cluster",
>         "i": [
>           "64876"
>         ]
>       },
>       {
>         "n": "Proc",
>         "i": [
>           "0"
>         ]
>       },
>       {
>         "n": "ReceivedBytes",
>         "r": [
>           "0.0"
>         ]
>       },
>       {
>         "n": "TerminatedNormally",
>         "b": [
>           {
>             "v": "t"
>           }
>         ]
>       },
>       {
>         "n": "TotalReceivedBytes",
>         "r": [
>           "0.0"
>         ]
>       },
>       {
>         "n": "ReturnValue",
>         "i": [
>           "0"
>         ]
>       },
>       {
>         "n": "RunLocalUsage",
>         "s": [
>           "Usr 0 00:00:00, Sys 0 00:00:00"
>         ]
>       }
>     ]
>   },
>   "type": "xml",
>   "@version": "1",
>   "@timestamp": "2020-09-17T15:00:21.876Z",
>   "message": "<c>\n    <a n=\"SentBytes\"><r>0.0</r></a>\n    <a
> n=\"TotalRemoteUsage\"><s>Usr 0 00:00:33, Sys 0 00:00:16</s></a>\n    <a
> n=\"TotalLocalUsage\"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>\n    <a
> n=\"EventTypeNumber\"><i>5</i></a>\n    <a
> n=\"TotalSentBytes\"><r>0.0</r></a>\n    <a n=\"Subproc\"><i>0</i></a>\n
>    <a n=\"MyType\"><s>JobTerminatedEvent</s></a>\n    <a
> n=\"RunRemoteUsage\"><s>Usr 0 00:00:33, Sys 0 00:00:16</s></a>\n    <a
> n=\"EventTime\"><s>2020-09-17T16:44:29.367</s></a>\n    <a
> n=\"Cluster\"><i>64876</i></a>\n    <a n=\"Proc\"><i>0</i></a>\n    <a
> n=\"ReceivedBytes\"><r>0.0</r></a>\n    <a n=\"TerminatedNormally\"><b
> v=\"t\"/></a>\n    <a n=\"TotalReceivedBytes\"><r>0.0</r></a>\n    <a
> n=\"ReturnValue\"><i>0</i></a>\n    <a n=\"RunLocalUsage\"><s>Usr 0
> 00:00:00, Sys 0 00:00:00</s></a>\n</c>",
>   "tags": [
>     "multiline"
>   ],
>   "path": "/var/log/condor-ce/EventLog.xml"
> }
> 
input {
  file {
    path => "/var/log/condor-ce/EventLog"
    start_position => "beginning"
    sincedb_path => "/var/log/condor-ce/.EventLog.sincedb"
    exclude => "*.gz"
    type => "xml"
    codec => multiline {
      pattern => "<c>"
      negate => "true"
      what => "previous"
    }
  }
}
filter{
    xml{
        source => "message"
        target => "xmlparse"
        force_array => false
#        store_xml => true
        namespaces => {
          "xsl" => "http://www.w3.org/1999/XSL/Transform"
          "xhtml" => "http://www.w3.org/1999/xhtml"
        }
#        add_tag => [ "xmltag" ]
    }
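    ruby {
        code => '
            # Promote each <a n=...> element to a top-level field and
            # cast the value according to its type tag (s/i/r/b).
            e = event.get("xmlparse")
            if e.is_a? Hash
                attrs = e["a"]
                # force_array => false yields a Hash for a lone <a> element
                attrs = [attrs] if attrs.is_a? Hash
                (attrs || []).each { |x|
                    key = x["n"]
                    value = nil
                    if x["s"]
                        value = x["s"]
                    elsif x["i"]
                        value = x["i"].to_i
                    elsif x["r"]
                        value = x["r"].to_f
                    elsif x["b"]
                        value = (x["b"]["v"] == "t")
                    end
                    event.set(key, value)
                }
            end
        '
#        add_tag => [ "rubytag" ]
        add_tag => [ "condorce","eventlog","grid" ]
        remove_field => [ "xmlparse" ]
    }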
}

output {
  stdout{
    codec => "json"
  }
  file {
    path => "/tmp/logstash.eventxml.debug"
    codec => "json_lines"
  }
}
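
To try the flattening step outside Logstash, here is a minimal plain-Ruby
sketch of the same idea as the ruby filter above. It assumes the input hash
has the shape the xml filter produces with force_array => false; the helper
name flatten_classad and the abbreviated sample hash (cut down from [3.b])
are made up for illustration and are not part of Logstash or HTCondor.

```ruby
# Sketch only: mirrors the logic of the ruby filter, without the Logstash event API.
# Assumed input shape: {"a" => [ {"n" => name, "s"|"i"|"r"|"b" => value}, ... ]}
# flatten_classad is a hypothetical helper name.
def flatten_classad(parsed)
  return {} unless parsed.is_a?(Hash)

  attrs = parsed["a"]
  # With force_array => false a lone <a> element arrives as a Hash, not an Array.
  attrs = [attrs] if attrs.is_a?(Hash)

  Array(attrs).each_with_object({}) do |x, out|
    next unless x.is_a?(Hash) && x["n"]

    out[x["n"]] =
      if x.key?("s") then x["s"]                  # string: keep as is
      elsif x.key?("i") then x["i"].to_i          # integer
      elsif x.key?("r") then x["r"].to_f          # real
      elsif x.key?("b") then x["b"].is_a?(Hash) && x["b"]["v"] == "t" # boolean
      end
  end
end

# Abbreviated sample in the assumed force_array => false shape.
sample = {
  "a" => [
    { "n" => "MyType", "s" => "JobTerminatedEvent" },
    { "n" => "Cluster", "i" => "64876" },
    { "n" => "SentBytes", "r" => "0.0" },
    { "n" => "TerminatedNormally", "b" => { "v" => "t" } }
  ]
}

flat = flatten_classad(sample)
p flat
```

Building a plain hash instead of calling event.set directly keeps the logic
testable on its own; inside the filter, each key/value pair of the result
would be passed to event.set as in the config above.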
