
[Lens] [Pie visualisation] incorrect visualisation for non-tsdb and tsdb data #157839

Closed
tetianakravchenko opened this issue May 16, 2023 · 26 comments
Labels
blocked bug Fixes for quality problems that affect the customer experience Feature:Lens impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@tetianakravchenko
Contributor

tetianakravchenko commented May 16, 2023

Kibana version:
8.8.0-SNAPSHOT

Elasticsearch version:
8.8.0-SNAPSHOT
Server OS version:

Browser version:

Browser OS version:

Original install method (e.g. download page, yum, from source, etc.):
elastic-package local setup

Describe the bug:
A Pie visualisation whose time range covers a mix of non-TSDB and TSDB data does not show correct data

Steps to reproduce:

  1. Install the k8s integration package
  2. After it has been running for some time, upgrade the k8s package to the version that uses TSDB by default
  3. Open a Pie chart visualisation whose time range covers both non-TSDB and TSDB data: it does not show correct data

Expected behavior:
The behavior is the same and correct whether the time range covers only non-TSDB data, a mix of non-TSDB and TSDB data, or only TSDB data

Screenshots (if relevant):
explanation:

  1. first I used a specific time frame covering non-TSDB data only - everything works
  2. then used the last 15 min time range (mixed non-TSDB and TSDB data) - incorrect data
  3. then used the last 4 minutes - TSDB data only - everything works
Screen.Recording.2023-05-16.at.10.14.56.mov

Errors in browser console (if relevant):

Provide logs and/or server output (if relevant):

Any additional context:

@tetianakravchenko tetianakravchenko added the bug Fixes for quality problems that affect the customer experience label May 16, 2023
@botelastic botelastic bot added the needs-team Issues missing a team label label May 16, 2023
@dej611 dej611 added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label May 16, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)

@botelastic botelastic bot removed the needs-team Issues missing a team label label May 16, 2023
@dej611
Contributor

dej611 commented May 16, 2023

Is the TSDB data downsampled?

@mlunadia

mlunadia commented May 16, 2023

@dej611 We are not doing anything with downsampling functionality, this is solely focused on testing the initial functionality for packages when TSDB is enabled. The scope only includes functionality in the current 8.8.0 snapshot and enabling TSDB.

@mlunadia

@agithomas @ritalwar can you confirm if you also see this issue?

@lalit-satapathy

@tetianakravchenko Is the issue only seen for visualisations that include a Pie chart?

I tried the steps below for nginx (which does not have a pie chart):

TSDB disabled dashboard
Screenshot 2023-05-16 at 3 31 33 PM
TSDB enabled and index rolled over. Checked the dashboard with a time range overlapping the rollover and it works
Screenshot 2023-05-16 at 3 37 48 PM

@lalit-satapathy

Hi Kibana team,

Can you confirm whether there is an issue with the Pie chart when mixing TSDB and non-TSDB data?

@stratoula
Contributor

@dej611 can you take a look when you have some time?

@ritalwar
Contributor

I also tested it for MSSQL and, like Lalit, I am not experiencing this issue, which may be because it doesn't involve any pie chart.

@dej611
Contributor

dej611 commented May 16, 2023

@tetianakravchenko provided an instance to reproduce the issue.
On this instance I've managed to reproduce the problem and isolate it to an Elasticsearch issue; the details are below.

Data started coming in at 16:13:13.
Rollover happened at 16:23:44.

If the terms agg has an order parameter set to use the last_value of a TSDB counter field, then the following happens:

  • if the time range filter starts and ends before the rollover time (16:13:13 - 16:23:43), then the correct data is reported
  • if the time range filter starts before the rollover and ends after it (16:13:13 - 16:23:44, or 1 sec after), then null values are returned
  • if the time range filter starts and ends after the rollover time, then the correct data is reported.

The following is the query used in the visualization (note: the query timestamps are in UTC, shifted 2 hours from the times above, so the time interval crosses the rollover):

{
  "aggs": {
    "0": {
      "terms": {
        "field": "kubernetes.apiserver.request.resource",
        "order": {
          "1-bucket>1-metric[kubernetes.apiserver.etcd.object.count]": "desc"
        },
        "size": 10,
        "shard_size": 25
      },
      "aggs": {
        "1-bucket": {
          "filter": {
            "bool": {
              "must": [],
              "filter": [
                {
                  "bool": {
                    "should": [
                      {
                        "exists": {
                          "field": "kubernetes.apiserver.etcd.object.count"
                        }
                      }
                    ],
                    "minimum_should_match": 1
                  }
                }
              ],
              "should": [],
              "must_not": []
            }
          },
          "aggs": {
            "1-metric": {
              "top_metrics": {
                "metrics": {
                  "field": "kubernetes.apiserver.etcd.object.count"
                },
                "size": 1,
                "sort": {
                  "@timestamp": "desc"
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "event.ingested",
      "format": "date_time"
    },
    {
      "field": "kubernetes.container.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.event.metadata.timestamp.created",
      "format": "date_time"
    },
    {
      "field": "kubernetes.event.timestamp.first_occurrence",
      "format": "date_time"
    },
    {
      "field": "kubernetes.event.timestamp.last_occurrence",
      "format": "date_time"
    },
    {
      "field": "kubernetes.node.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.pod.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.service.created",
      "format": "date_time"
    },
    {
      "field": "kubernetes.storageclass.created",
      "format": "date_time"
    },
    {
      "field": "kubernetes.system.start_time",
      "format": "date_time"
    },
    {
      "field": "process.cpu.start_time",
      "format": "date_time"
    },
    {
      "field": "system.process.cpu.start_time",
      "format": "date_time"
    }
  ],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "runtime_mappings": {},
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_phrase": {
            "data_stream.dataset": "kubernetes.apiserver"
          }
        },
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2023-05-16T14:13:00.122Z",
              "lte": "2023-05-16T14:25:44.122Z"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

The response for the query above is the following:

{
  "took": 1808,
  "timed_out": false,
  "_shards": {
    "total": 34,
    "successful": 34,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "0": {
      "doc_count_error_upper_bound": -1,
      "sum_other_doc_count": 8282,
      "buckets": [
        {
          "key": "allowlistedworkloads",
          "doc_count": 88,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "apiservices",
          "doc_count": 242,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "backendconfigs",
          "doc_count": 44,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "certificatesigningrequests",
          "doc_count": 132,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "clusterrolebindings",
          "doc_count": 198,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "clusterroles",
          "doc_count": 198,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "componentstatuses",
          "doc_count": 44,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "cronjobs",
          "doc_count": 88,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "csidrivers",
          "doc_count": 88,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        },
        {
          "key": "csinodes",
          "doc_count": 242,
          "1-bucket": {
            "doc_count": 0,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 0,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": []
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:23:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": null
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Now if I change the lte clause to fall before the rollover, or move the gte to after it - as described above - the response will be something like:

{
  "took": 2728,
  "timed_out": false,
  "_shards": {
    "total": 34,
    "successful": 34,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "0": {
      "doc_count_error_upper_bound": -1,
      "sum_other_doc_count": 5776,
      "buckets": [
        {
          "key": "clusterroles.rbac.authorization.k8s.io",
          "doc_count": 16,
          "1-bucket": {
            "doc_count": 16,
            "1-metric": {
              "hits": {
                "total": {
                  "value": 16,
                  "relation": "eq"
                },
                "max_score": null,
                "hits": [
                  {
                    "_index": "xxxxxxx",
                    "_id": "yyyyyyyy",
                    "_score": null,
                    "fields": {
                      "kubernetes.apiserver.etcd.object.count": [
                        100
                      ]
                    },
                    "sort": [
                      1684246844010
                    ]
                  }
                ]
              }
            }
          },
          "0-orderAgg": {
            "top": [
              {
                "sort": [
                  "2023-05-16T14:20:44.010Z"
                ],
                "metrics": {
                  "kubernetes.apiserver.etcd.object.count": 100
                }
              }
            ]
          }
        },
        ...
      ]
    }
  }
}

@lalit-satapathy

@dej611,

It is not clear what the exact issue is. Does it apply only to Pie charts, or to other visualisations as well?

On this instance I've managed to reproduce the problem and isolate it to an Elasticsearch issue; the details are below.

@martijnvg, can you help confirm whether this is an Elasticsearch problem?

@dej611
Contributor

dej611 commented May 17, 2023

Any query/visualization with a terms agg which is ordered by a top_metrics/Last value metric is affected by what I've reproduced.
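
For reference, this is a minimal sketch of the affected aggregation shape, distilled from the full query above (the field names here are placeholders, not the actual dashboard fields): a terms aggregation whose order targets a top_metrics sub-aggregation sorted by @timestamp on a counter field.

{
  "size": 0,
  "aggs": {
    "breakdown": {
      "terms": {
        "field": "some.keyword.dimension",
        "order": {
          "last_value[some.counter.field]": "desc"
        },
        "size": 10
      },
      "aggs": {
        "last_value": {
          "top_metrics": {
            "metrics": {
              "field": "some.counter.field"
            },
            "size": 1,
            "sort": {
              "@timestamp": "desc"
            }
          }
        }
      }
    }
  }
}

With this shape, any time range crossing the rollover reproduces the null buckets shown above.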

@agithomas

@agithomas @ritalwar can you confirm if you also see this issue?

The integrations I have been testing do not have any pie chart to verify this, so I couldn't find a use case yet to verify it.

@dej611
Contributor

dej611 commented May 17, 2023

The same issue can be reproduced with any chart in Lens, as long as it uses a terms aggregation ordered by a top_metrics on a counter field:

  • time range starts and ends before the rollover:
Screenshot 2023-05-17 at 10 21 45
  • time range starts before rollover but ends after it:
Screenshot 2023-05-17 at 10 21 59
  • time range starts and ends after rollover:
Screenshot 2023-05-17 at 10 22 18

@martijnvg
Member

@dej611 This looks strange and is unexpected. Can you open an ES issue? Does the search response return shard failures?

@mlunadia mlunadia changed the title [Lens] [Pie visualisation] incorrect visualisation for non-tsdb and tsdb data [Lens] Lens visualisations for non-tsdb and tsdb data display incorrectly when a terms ordered by a top_metrics using a counter field is used May 17, 2023
@dej611
Contributor

dej611 commented May 17, 2023

@martijnvg no shard failures.

@salvatore-campagna

salvatore-campagna commented May 18, 2023

Could you share the mappings for the index? At least for the two fields involved - resource and count. Probably they are keyword and counter.
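
For example, the two fields could be pulled on their own with the field mapping API (the data stream pattern here is just a guess, adjust to the actual backing indices):

GET metrics-kubernetes.apiserver-*/_mapping/field/kubernetes.apiserver.request.resource,kubernetes.apiserver.etcd.object.count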

@salvatore-campagna

It would also be helpful to have the settings for both involved indices.

@mlunadia

Could you share the mappings for the index? At least for the two fields involved - resource and count. Probably they are keyword and counter.
It would also be helpful to have the settings for both involved indices.

@constanca-m can you liaise with @salvatore-campagna to make this info available?

@constanca-m

@salvatore-campagna, the resource is a keyword:

    - name: request.resource
      dimension: true
      type: keyword
      description: |
        Requested resource

And the count is of type gauge:

    - name: etcd.object.count
      type: long
      metric_type: gauge
      description: Number of kubernetes objects at etcd
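
For reference, in a TSDB-enabled backing index these package definitions would typically translate into mappings along these lines (a sketch based on the package fields above, not the actual mapping from the cluster):

{
  "properties": {
    "kubernetes": {
      "properties": {
        "apiserver": {
          "properties": {
            "request": {
              "properties": {
                "resource": {
                  "type": "keyword",
                  "time_series_dimension": true
                }
              }
            },
            "etcd": {
              "properties": {
                "object": {
                  "properties": {
                    "count": {
                      "type": "long",
                      "time_series_metric": "gauge"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}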

@dej611
Contributor

dej611 commented May 23, 2023

I had to attach both the mappings and the settings as txt files as they were too big.

tsdb-rolled-over-mappings.txt
tsdb-rolled-over-settings.txt

@salvatore-campagna

As described in elastic/elasticsearch#96192, I tried to reproduce this issue using a YAML test but did not manage to. I wonder if there is any chance to look at the Elasticsearch logs and see if there is anything there that might help.

@salvatore-campagna

Would it be possible to run the query using profile: true?
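
profile is just a top-level flag on the search body, so it should be enough to re-run the exact query from the earlier comment with one extra line (sketch, rest of the request body unchanged):

{
  "profile": true,
  "size": 0,
  "aggs": { ... },
  "query": { ... }
}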

@mlunadia mlunadia changed the title [Lens] Lens visualisations for non-tsdb and tsdb data display incorrectly when a terms ordered by a top_metrics using a counter field is used [Lens] [Pie visualisation] incorrect visualisation for non-tsdb and tsdb data May 25, 2023
@stratoula stratoula added the impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. label Jun 1, 2023
@dej611
Contributor

dej611 commented Sep 4, 2023

We should consider renaming this issue to something closer to the actual problem.
It is not currently a Kibana issue; the problem has been isolated at the ES level.

@stratoula
Contributor

@dej611 do you think we can close it, since it is not a Kibana issue and it will be resolved when ES fixes it?

@dej611
Contributor

dej611 commented Sep 5, 2023

I would prefer to keep it open, but with a meaningful title and labels to track it.

@dej611
Contributor

dej611 commented Sep 5, 2023

We discussed offline and decided to close this and track only the ES issue.

@dej611 dej611 closed this as completed Sep 5, 2023