[ML] Adds ML modules for Metrics UI Integration #76460

Merged
merged 13 commits into from
Sep 17, 2020

Conversation

@blaklaybul blaklaybul commented Sep 1, 2020

Summary

Adds the files for the new metrics_ui_hosts and metrics_ui_k8s modules for use within the Metrics app. The modules contain the job and datafeed configuration files that support the analyses designed for the metrics integration.

For each module, this PR contains:

  • module manifest.json containing a query that uniquely defines when the module should appear in the ML app.
  • ML Job configurations for 4 jobs:
    • {hosts||k8s}_cpu_usage
    • {hosts||k8s}_memory_usage
    • {hosts||k8s}_network_in
    • {hosts||k8s}_network_out
  • Datafeed configurations to accompany the jobs.
  • Logo

To Do:

  • provide job and module descriptions, finalize titles

@blaklaybul blaklaybul added the :ml label Sep 1, 2020
},
"jobs": [
{
"id": "hosts_cpu_usage",
Contributor

Noticed that the host jobs have an id prefixed with hosts, but the k8s jobs don't. Should we be consistent here? Not sure we actually need the prefix.

Contributor Author

I've included the k8s prefix on the Kubernetes jobs, as we'll want to distinguish between them in ML job management.

@blaklaybul
Contributor Author

Custom URLs have been added for all jobs. The app/ prefix has been left off, so they will not work as is. We're awaiting a PR from @peteharverson that will include metrics in our isKibanaUrl check for custom URLs. Also, to accommodate the new URLs, kubernetes.pod.name has been replaced by kubernetes.pod.id in the terms aggs in datafeed_k8s_network_in.json and datafeed_k8s_network_out.json. This ensures that these field values are passed to the datafeed and can therefore be used as influencers in the URL creation.

{
"id": "metrics_ui_hosts",
"title": "Metrics Hosts",
"description": "Detect anomalous memory, cpu, and network behavior on hosts.",
Contributor

Suggested change
- "description": "Detect anomalous memory, cpu, and network behavior on hosts.",
+ "description": "Detect anomalous memory, CPU, and network behavior on hosts.",

Contributor Author

updated - thanks!

"hosts",
"metrics"
],
"description": "Metrics: Hosts - Identify unusual spikes in cpu utilization across hosts.",
Contributor

Suggested change
- "description": "Metrics: Hosts - Identify unusual spikes in cpu utilization across hosts.",
+ "description": "Metrics: Hosts - Identify unusual spikes in CPU utilization across hosts.",

Contributor Author

updated - thanks!

{
"id": "metrics_ui_k8s",
"title": "Metrics Kubernetes",
"description": "Detect anomalous memory, cpu, and network behavior on kubernetes pods.",
Contributor

Suggested change
- "description": "Detect anomalous memory, cpu, and network behavior on kubernetes pods.",
+ "description": "Detect anomalous memory, CPU, and network behavior on Kubernetes pods.",

Contributor Author

updated - thanks!

"k8s",
"metrics"
],
"description": "Metrics: Kubernetes - Identify unusual spikes in cpu utilization across kubernetes pods.",
Contributor

Suggested change
- "description": "Metrics: Kubernetes - Identify unusual spikes in cpu utilization across kubernetes pods.",
+ "description": "Metrics: Kubernetes - Identify unusual spikes in CPU utilization across Kubernetes pods.",

Contributor Author

updated - thanks!

"k8s",
"metrics"
],
"description": "Metrics: Kubernetes - Identify unusual spikes in memory usage across kubernetes pods.",
Contributor

Suggested change
- "description": "Metrics: Kubernetes - Identify unusual spikes in memory usage across kubernetes pods.",
+ "description": "Metrics: Kubernetes - Identify unusual spikes in memory usage across Kubernetes pods.",

Contributor Author

updated - thanks!

@@ -0,0 +1,42 @@
{
"job_type": "anomaly_detector",
"description": "Metrics: Kubernetes - Identify unusual spikes in inbound traffic across kubernetes pods.",
Contributor

Suggested change
- "description": "Metrics: Kubernetes - Identify unusual spikes in inbound traffic across kubernetes pods.",
+ "description": "Metrics: Kubernetes - Identify unusual spikes in inbound traffic across Kubernetes pods.",

Contributor Author

updated - thanks!

@@ -0,0 +1,42 @@
{
"job_type": "anomaly_detector",
"description": "Metrics: Kubernetes - Identify unusual spikes in outbound traffic across kubernetes pods.",
Contributor

Suggested change
- "description": "Metrics: Kubernetes - Identify unusual spikes in outbound traffic across kubernetes pods.",
+ "description": "Metrics: Kubernetes - Identify unusual spikes in outbound traffic across Kubernetes pods.",

Contributor Author

updated - thanks!

@blaklaybul
Contributor Author

blaklaybul commented Sep 15, 2020

@phillipb I wanted to lay out how you will need to override the job and datafeed configs based on the partition fields provided by the user (using cloud.project.id in the following examples). The flow will be slightly different for the hosts and k8s modules:

metrics-ui-hosts

job configurations

Each detector in the analysis_config object of each job configuration will need to include "partition_field_name": "cloud.project.id", and cloud.project.id will need to be added to the list of influencers. So, for example, for the job hosts_network_in, the analysis_config will need to appear as follows:

"analysis_config": {
      "bucket_span": "15m",
      "detectors": [
        {
          "detector_description": "max(bytes_in_derivative)",
          "function": "max",
          "field_name": "bytes_in_derivative",
          "partition_field_name": "cloud.project.id"
        }
      ],
      "influencers": [
        "host.name",
        "cloud.project.id"
      ],
      "summary_count_field_name": "doc_count"
    }
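
The override described above can be sketched as a small helper. This is only an illustration, not the code the Metrics UI uses: the function name add_partition_field is hypothetical, and the job dictionary is a trimmed copy of the hosts_network_in example above.

```python
import copy


def add_partition_field(job_config, partition_field):
    """Return a copy of job_config partitioned on partition_field.

    Hypothetical helper: sets partition_field_name on every detector and
    appends the field to the influencer list if it is not already there.
    """
    updated = copy.deepcopy(job_config)
    analysis_config = updated["analysis_config"]
    for detector in analysis_config["detectors"]:
        detector["partition_field_name"] = partition_field
    influencers = analysis_config.setdefault("influencers", [])
    if partition_field not in influencers:
        influencers.append(partition_field)
    return updated


# Trimmed hosts_network_in job config, as shipped without a partition field.
hosts_network_in = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {
                "detector_description": "max(bytes_in_derivative)",
                "function": "max",
                "field_name": "bytes_in_derivative",
            }
        ],
        "influencers": ["host.name"],
        "summary_count_field_name": "doc_count",
    }
}

overridden = add_partition_field(hosts_network_in, "cloud.project.id")
```

Working on a deep copy keeps the shipped module config unchanged, so the same base config can be reused for a different user-supplied field.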

datafeed configurations

For the metrics-ui-hosts module, all datafeeds with the exception of datafeed_hosts_memory_usage use aggregations - for these, we will need to wrap the existing aggregation in a terms agg on the user-supplied partition field. This terms agg must have a name matching the user-supplied field. Using datafeed_hosts_network_in as an example, the aggregations object will need to appear as such:

{
    "aggregations": {
        "cloud.project.id": {
            "terms": {
                "field": "cloud.project.id"
            },
            "aggregations": {
                "host.name": {"terms": {"field": "host.name"},
                    "aggregations": {
                        "buckets": {
                            "date_histogram": {"field": "@timestamp","fixed_interval": "5m"},
                            "aggregations": {
                                "@timestamp": {"max": {"field": "@timestamp"}},
                                "bytes_in_max": {"max": {"field": "system.network.in.bytes"}},
                                "bytes_in_derivative": {"derivative": {"buckets_path": "bytes_in_max"}},
                                "positive_only":{
                                    "bucket_script": {
                                        "buckets_path": {"in_derivative": "bytes_in_derivative.value"},
                                        "script": "params.in_derivative > 0.0 ? params.in_derivative : 0.0"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
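
A minimal sketch of the wrapping step, assuming the datafeed config stores its aggregations under the "aggregations" key as in the example above. The helper name wrap_in_partition_terms is hypothetical, the sample datafeed is trimmed for brevity, and no terms "size" is set because the example above does not show one.

```python
import copy


def wrap_in_partition_terms(datafeed_config, partition_field):
    """Return a copy of datafeed_config whose aggregations are nested
    inside a terms agg named after (and keyed on) partition_field.

    Hypothetical helper for illustration only.
    """
    updated = copy.deepcopy(datafeed_config)
    existing = updated["aggregations"]
    updated["aggregations"] = {
        partition_field: {
            "terms": {"field": partition_field},
            "aggregations": existing,
        }
    }
    return updated


# Trimmed datafeed_hosts_network_in config before the override.
datafeed_hosts_network_in = {
    "aggregations": {
        "host.name": {
            "terms": {"field": "host.name"},
            "aggregations": {
                "buckets": {
                    "date_histogram": {"field": "@timestamp", "fixed_interval": "5m"},
                    "aggregations": {
                        "@timestamp": {"max": {"field": "@timestamp"}},
                        "bytes_in_max": {"max": {"field": "system.network.in.bytes"}},
                    },
                }
            },
        }
    }
}

wrapped = wrap_in_partition_terms(datafeed_hosts_network_in, "cloud.project.id")
```

The agg name and the "field" value are both taken from the same argument, which satisfies the requirement that the terms agg name match the user-supplied field.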

metrics-ui-k8s

job configurations

Since the metrics-ui-k8s module is shipping with a default partition field - kubernetes.namespace - the detectors in analysis_config already contain "partition_field_name": "kubernetes.namespace". If the user chooses to override the default, all you will need to do is replace kubernetes.namespace with the user-supplied field in the detector and in the influencer list.
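
The replacement could look roughly like the sketch below. The helper name replace_partition_field is hypothetical, and the sample job (its detector field and its influencer list) is illustrative rather than copied from the module files.

```python
import copy


def replace_partition_field(job_config, default_field, user_field):
    """Return a copy of job_config with default_field swapped for
    user_field in every detector and in the influencer list.

    Hypothetical helper for illustration only.
    """
    updated = copy.deepcopy(job_config)
    analysis_config = updated["analysis_config"]
    for detector in analysis_config["detectors"]:
        if detector.get("partition_field_name") == default_field:
            detector["partition_field_name"] = user_field
    analysis_config["influencers"] = [
        user_field if name == default_field else name
        for name in analysis_config.get("influencers", [])
    ]
    return updated


# Illustrative k8s job config; field values are assumptions.
k8s_network_in = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {
                "function": "max",
                "field_name": "bytes_in_derivative",
                "partition_field_name": "kubernetes.namespace",
            }
        ],
        "influencers": ["kubernetes.namespace", "kubernetes.pod.id"],
    }
}

overridden_k8s = replace_partition_field(
    k8s_network_in, "kubernetes.namespace", "cloud.project.id"
)
```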

datafeed configurations

Only datafeed_k8s_network_in and datafeed_k8s_network_out contain aggregations in the metrics-ui-k8s module. Since we are supplying a default partition field, the outer aggregations are already in the configs. So for these datafeeds, all you will need to do is replace kubernetes.namespace with the user-supplied field name in the outer aggregation in two places: the name of the agg and the "field" value.
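
A sketch of that two-place replacement, assuming the outer terms agg sits directly under the "aggregations" key. The helper name rename_outer_partition_agg is hypothetical, and the sample datafeed is a trimmed, illustrative stand-in for datafeed_k8s_network_in.

```python
import copy


def rename_outer_partition_agg(datafeed_config, default_field, user_field):
    """Return a copy of datafeed_config whose outer terms agg is renamed
    from default_field to user_field and keyed on user_field.

    Hypothetical helper for illustration only.
    """
    updated = copy.deepcopy(datafeed_config)
    aggregations = updated["aggregations"]
    outer = aggregations.pop(default_field)  # place 1: the name of the agg
    outer["terms"]["field"] = user_field     # place 2: the "field" value
    aggregations[user_field] = outer
    return updated


# Illustrative k8s datafeed; the inner aggregation is trimmed.
datafeed_k8s_network_in = {
    "aggregations": {
        "kubernetes.namespace": {
            "terms": {"field": "kubernetes.namespace"},
            "aggregations": {
                "kubernetes.pod.id": {"terms": {"field": "kubernetes.pod.id"}}
            },
        }
    }
}

renamed = rename_outer_partition_agg(
    datafeed_k8s_network_in, "kubernetes.namespace", "cloud.project.id"
)
```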

@blaklaybul blaklaybul marked this pull request as ready for review September 16, 2020 00:01
@blaklaybul blaklaybul requested a review from a team as a code owner September 16, 2020 00:01
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@blaklaybul blaklaybul self-assigned this Sep 16, 2020
@peteharverson
Contributor

peteharverson commented Sep 16, 2020

Worth noting here for reference: there is an open issue, #18464, that we currently do not support plotting metric data in the Anomaly Explorer or Single Metric Viewer charts for detectors that use a derivative aggregation on a scripted field. The difficulty is reverse engineering the datafeed config aggregations back into a search that can be run on the source data to obtain the metric data for the charts. Currently the charts are just blank.

image

This will affect the four inbound / outbound traffic jobs.

Similarly, the hosts CPU usage job uses a bucket_script for the CPU metric in the datafeed, so again, the metric data in the anomaly charts will be blank.
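
As a rough way to see which datafeeds are affected, the sketch below walks a datafeed's aggregations and lists the ones that use a derivative or bucket_script agg. This is only a heuristic written for this discussion, not the logic the ML charts use; the function name find_pipeline_aggs is hypothetical, and the sample is the hosts network-in aggregation from the override example earlier in this thread.

```python
PIPELINE_AGG_TYPES = ("derivative", "bucket_script")


def find_pipeline_aggs(aggregations):
    """Return the names of aggs that use a derivative or bucket_script,
    searching nested "aggregations"/"aggs" blocks depth-first.

    Hypothetical helper for illustration only.
    """
    found = []
    for name, body in aggregations.items():
        for key, value in body.items():
            if key in PIPELINE_AGG_TYPES:
                found.append(name)
            elif key in ("aggregations", "aggs") and isinstance(value, dict):
                found.extend(find_pipeline_aggs(value))
    return found


hosts_network_in_aggs = {
    "cloud.project.id": {
        "terms": {"field": "cloud.project.id"},
        "aggregations": {
            "host.name": {
                "terms": {"field": "host.name"},
                "aggregations": {
                    "buckets": {
                        "date_histogram": {"field": "@timestamp", "fixed_interval": "5m"},
                        "aggregations": {
                            "@timestamp": {"max": {"field": "@timestamp"}},
                            "bytes_in_max": {"max": {"field": "system.network.in.bytes"}},
                            "bytes_in_derivative": {"derivative": {"buckets_path": "bytes_in_max"}},
                            "positive_only": {
                                "bucket_script": {
                                    "buckets_path": {"in_derivative": "bytes_in_derivative.value"},
                                    "script": "params.in_derivative > 0.0 ? params.in_derivative : 0.0",
                                }
                            },
                        },
                    }
                },
            }
        },
    }
}

flagged = find_pipeline_aggs(hosts_network_in_aggs)
# flagged == ["bytes_in_derivative", "positive_only"]
```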

Contributor

@peteharverson peteharverson left a comment

Tested these two modules with the metrics-ui-full data set and overall it looks good. A couple of questions about descriptions for detectors, plus whether we want to remove the query section from the hosts module before merging.

"type": "Metricbeat Data",
"logoFile": "logo.json",
"defaultIndexPattern": "metricbeat-*",
"query": {
Contributor

If the jobs in this module provide no value without the specific overrides we are expecting from the Metrics UI, then removing this query block is the way to hide it from the ML job wizards.

Contributor Author

the query and defaultIndexPattern fields have been removed, as we do in the logs integration modules.

@blaklaybul
Contributor Author

As per @sorantis's request, the CPU jobs have been removed from both modules.

@blaklaybul
Contributor Author

@elasticmachine merge upstream

Contributor

@peteharverson peteharverson left a comment

Latest edits LGTM

Contributor

@lcawl lcawl left a comment

Descriptions LGTM

@blaklaybul
Contributor Author

@elasticmachine merge upstream

@peteharverson
Contributor

@elasticmachine merge upstream

@blaklaybul blaklaybul merged commit 6d12c68 into elastic:master Sep 17, 2020
blaklaybul pushed a commit to blaklaybul/kibana that referenced this pull request Sep 17, 2020
* adds metrics ml integration

* renames jobs, updates datafeeds

* adds allow_no_indices: true for datafeeds

* updates module ids in manifest

* adds custom urls

* adds module and individual job descriptions

* removes model plots

* updates terms agg sizes

* updates chunking config

* removes query and default index pattern from manifest, updates descriptions

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
blaklaybul pushed a commit that referenced this pull request Sep 17, 2020
@kibanamachine
Contributor

💚 Build Succeeded

Build metrics

distributable file count

id value diff baseline
default 45934 +16 45918

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

jloleysens added a commit to jloleysens/kibana that referenced this pull request Sep 18, 2020
…rok/new-patterns-component-use-array

* 'master' of github.com:elastic/kibana: (140 commits)
  Add telemetry as an automatic privilege grant (elastic#77390)
  [Security Solutions][Cases] Cases Redesign (elastic#73247)
  Use Search API in TSVB (elastic#76274)
  [Mappings editor] Add support for constant_keyword field type (elastic#76564)
  [ML] Adds ML modules for Metrics UI Integration (elastic#76460)
  [Drilldowns] {{event.points}} in URL drilldown for VALUE_CLICK_TRIGGER (elastic#76771)
  Migrate status & stats APIs to KP + remove legacy status lib (elastic#76054)
  use App updater API instead of deprecated chrome.navLinks.update (elastic#77708)
  [CSM Dashboard] Remove points from line chart (elastic#77617)
  [APM] Trace timeline: Replace multi-fold function icons with new EuiIcon glyphs (elastic#77470)
  [Observability] Overview: Alerts section style improvements (elastic#77670)
  Bump the Node.js version used by Docker in CI (elastic#77714)
  Upgrade all minimist (sub)dependencies to version ^1.2.5 (elastic#60284)
  Remove unneeded forced package resolutions (elastic#77467)
  [ML] Add metrics app to check made for internal custom URLs (elastic#77627)
  Functional tests - add supertest for test_user (elastic#77584)
  [ML] Adding option to create AD jobs without starting the datafeed (elastic#77484)
  Bump node-fetch to 2.6.1 (elastic#77445)
  Bump sharkdown from v0.1.0 to v0.1.1 (elastic#77607)
  [APM]fixing y axis on transaction error rate to 100% (elastic#77609)
  ...

# Conflicts:
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/manage_processor_form/manage_processor_form.container.tsx
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/manage_processor_form/manage_processor_form.tsx
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/processor_form/field_components/drag_and_drop_text_list.scss
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/processor_form/field_components/drag_and_drop_text_list.tsx
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/processor_form/field_components/text_editor.scss
#	x-pack/plugins/ingest_pipelines/public/application/components/pipeline_processors_editor/components/processor_form/processors/grok.test.tsx