
Re: [HTCondor-users] adstash / grafana question: grouping by Cmd basename



Hi Jason,

I ended up getting one step closer to what you're describing with a grok processor in an ingest node pipeline in elasticsearch itself.

Related question: is there any current or planned support for ingesting histories from old history files?

The following ingest pipeline puts everything in a weekly time-stamped index, parses CompletionDate into @timestamp, and finally adds a new field, "cmd_name", which has what I was after.

[
 {
   "date_index_name": {
     "field": "CompletionDate",
     "date_rounding": "w",
     "index_name_prefix": "htcondor-igwn-",
     "date_formats": [
       "UNIX"
     ]
   }
 },
 {
   "date": {
     "field": "CompletionDate",
     "formats": [
       "UNIX"
     ]
   }
 },
 {
   "grok": {
     "field": "Cmd",
     "patterns": [
       "^.*/%{DATA:cmd_name}$",
       "^%{DATA:cmd_name}$"
     ]
   }
 }
]

The latter part of which was spelled out in:

https://discuss.elastic.co/t/ingest-pipeline-to-parse-basename-from-path/205809
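For anyone who wants to sanity-check the two grok patterns outside of elasticsearch, here is a rough Python sketch of what they do. It assumes %{DATA} behaves like the lazy regex ".*?"; the function name cmd_name() and the example paths are made up for illustration and are not part of the pipeline or of adstash.

```python
import re

# Rough Python translations of the two grok patterns in the pipeline.
# Assumption: %{DATA:cmd_name} corresponds to a lazy ".*?" named group.
PATTERNS = [
    re.compile(r"^.*/(?P<cmd_name>.*?)$"),  # path with directories: keep text after the last "/"
    re.compile(r"^(?P<cmd_name>.*?)$"),     # bare command name with no "/"
]


def cmd_name(cmd):
    """Return the basename the patterns would capture, trying them in order."""
    for pattern in PATTERNS:
        match = pattern.match(cmd)
        if match:
            return match.group("cmd_name")
    return None


# Illustrative inputs only (hypothetical paths).
for cmd in ["/home/user/run/analysis.sh", "analysis.sh"]:
    print(cmd, "->", cmd_name(cmd))
```

The patterns are tried in order, so the second one only matters for Cmd values that contain no "/" at all.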

On 10/11/21 2:57 PM, Jason Patton wrote:
Hi James,

I'm not sure how to do this from the Grafana side; hopefully someone else can chime in there. I would like to have a clean way of doing this *upstream* by letting adstash users add their own Python code for creating their own ES document fields (using data from the job ad or whatever else they want), and I'm happy to take suggestions on how we might go about this. As the code is now, you *could* edit convert.py in the library code (in $(condor_config_val libexec)/adstash) to add a new text or keyword field, compute and add the value for this field in the "to_json()" function, and then redo your ES index(es), but this is clearly not a good solution for many reasons :)
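As a rough illustration of the kind of per-document field computation described above, the sketch below derives a "cmd_name" field from "Cmd". It is only a sketch under stated assumptions: add_cmd_name() is a hypothetical helper, not an existing adstash function or hook, and the document is assumed to be a plain dict keyed by job ad attribute names.

```python
import posixpath


def add_cmd_name(doc):
    """Return a copy of a document dict with a "cmd_name" field added.

    Hypothetical helper for illustration; adstash does not provide this hook.
    """
    out = dict(doc)
    cmd = out.get("Cmd")
    if isinstance(cmd, str) and cmd:
        # posixpath is used so the result does not depend on the local OS.
        out["cmd_name"] = posixpath.basename(cmd)
    return out


# Illustrative document only; the values are invented.
doc = add_cmd_name({"Cmd": "/home/user/run/analysis.sh", "CompletionDate": 1633700000})
print(doc["cmd_name"])
```

Documents without a usable Cmd value are returned unchanged, so the new field would simply be absent for them.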

Jason Patton

On Fri, Oct 8, 2021 at 8:06 AM James Alexander Clark <jaclark@xxxxxxxxxxx> wrote:

    Hello,

    I would like to build a dashboard in which job history metrics gathered
    by adstash and pumped to elasticsearch are grouped by executable
    *basename*.

    The motivation is we have users (myself included) who run jobs from
    e.g. shell scripts copied to an execute directory but which are
    fundamentally the same process - I would like to monitor such jobs for
    failures and general performance, broken down by the IGWN/OSG site at
    which they ran.

    My plan was to build a variable with:

    {"find": "terms", "field": "Cmd"}

    and strip out the path with a regex, leaving the basename - that gives
    me a drop-down menu of executable basenames I'd like to use in the
    panel query.

    My immediate problem seemed to be that Cmd is not an indexed field, so
    it looks like I cannot do any kind of straightforward searching on that.

    Instead, I figured maybe I can just abandon using a variable in a query
    and simply group-by the Cmd field and then use a rename-by-regex
    transform to give me the basenames. That at least gives me a panel
    where I can select individual traces to give basically the same result.

    One problem: I just get a ton of traces with identical names
    (corresponding to the now absent paths).

    Anyone know if there's a further grafana transform I can use to group
    those identically-named results together?

    Or maybe the better thing to do is just add an attribute, either
    through a job transform or (maybe) elasticsearch ingest node pipeline,
    that has the desired information so it's there in the db in the first
    place?

    Hopefully this isn't too obscure / into the weeds to be meaningful.
    Thanks

-- James Alexander Clark
    LIGO Laboratory
    California Institute of Technology
    email: james.clark@xxxxxxxx
    Tel. (cell): 413-230-1412
    _______________________________________________
    HTCondor-users mailing list
    To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
    subject: Unsubscribe
    You can also unsubscribe by visiting
    https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

    The archives can be found at:
    https://lists.cs.wisc.edu/archive/htcondor-users/



--
James Alexander Clark
LIGO Laboratory
California Institute of Technology
email:  james.clark@xxxxxxxx
Tel. (cell):  413-230-1412