Add documentation for kibana:plugin_render_time #184206

Merged
merged 9 commits into from
May 29, 2024
202 changes: 166 additions & 36 deletions dev_docs/tutorials/performance/adding_custom_performance_metrics.mdx
Expand Up @@ -8,10 +8,12 @@ tags: ['kibana', 'onboarding', 'setup', 'performance', 'development', 'telemetry
---

# Build and track custom performance metrics

Having access to performance metrics allows us to better understand the user experience across Kibana, identify issues, and fix them.
Custom metrics allow us to monitor critical flows such as server start, saved object fetching, or dashboard loading times.

## Instrument your code to report metric event.
## Instrument your code to report custom metric event.

We use event-based telemetry (EBT) to report client-side metrics as events.
If you want to add a custom metric on the server side, please notify the #kibana-core team in advance.

Expand All @@ -28,7 +30,7 @@ Once we have the time measurement, we can use the `reportPerformanceMetricEvent`

```typescript
reportPerformanceMetricEvent(analytics, {
  eventName: APP_ACTION,
  duration: actionDuration,
});
```
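
As a rough sketch of where a value like `actionDuration` could come from, the helper below times an action with `performance.now()` and builds the payload. The `MetricEvent` interface and the `startMeasurement` helper are hypothetical names introduced for illustration only; they are not part of `@kbn/ebt-tools`.

```typescript
// Hypothetical minimal payload shape; the real types come from '@kbn/ebt-tools'.
interface MetricEvent {
  eventName: string;
  duration: number;
}

type Clock = () => number;

// Starts a measurement and returns a function that stops it and builds the payload.
// The clock is injectable so the helper can be exercised without real timers.
function startMeasurement(
  eventName: string,
  now: Clock = () => performance.now()
): () => MetricEvent {
  const startedAt = now();
  return () => ({ eventName, duration: now() - startedAt });
}

const stopMeasurement = startMeasurement('APP_ACTION');
// ... run the action you want to measure here ...
const appActionEvent = stopMeasurement();
// In Kibana, the payload would then be handed over to the reporting API:
// reportPerformanceMetricEvent(analytics, appActionEvent);
```

Returning a "stop" function keeps the start and end of the measurement next to the code being measured, which makes it harder to accidentally include unrelated work in the duration.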
Expand All @@ -42,7 +44,7 @@ Each document in the index has the following structure:

```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser \ server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
Expand All @@ -67,26 +69,27 @@ Each document in the index has the following structure:

### Performance events with breakdowns and metadata

Let's assume we are interested in benchmarking the performance of a more complex event, `COMPLEX_APP_ACTION`, that is made up of two steps:

- `INSPECT_DATA` measures the time it takes to retrieve a user's profile and check if there is a cached version of their data.
  - If the cached data is fresh, it proceeds with the `use-local-data` flow.
  - If the data needs to be refreshed, it proceeds with the `load-data-from-api` flow.
- `PROCESS_DATA` loads and processes the data depending on the flow chosen in the previous step.

We could utilize the additional options supported by the `reportPerformanceMetricEvent` API:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

reportPerformanceMetricEvent(analytics, {
  eventName: COMPLEX_APP_ACTION,
  duration, // Total duration in milliseconds
  key1: INSPECT_DATA, // Claiming free key1 to be used for INSPECT_DATA
  value1: durationOfStepA, // Total duration of step INSPECT_DATA in milliseconds
  key2: PROCESS_DATA, // Claiming free key2 to be used for PROCESS_DATA
  value2: durationOfStepB, // Total duration of step PROCESS_DATA in milliseconds
  meta: {
    dataSource: 'load-data-from-api', // Providing event specific context. This can be useful to create meaningful aggregations.
  },
});
```
Expand All @@ -95,7 +98,7 @@ This event will be indexed with the following structure:

```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser \ server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
Expand All @@ -106,8 +109,8 @@ This event will be indexed with the following structure:
"key2": PROCESS_DATA, // The key name of PROCESS_DATA
"value2": 520, // The duration of step PROCESS_DATA
"meta": {
"dataSource": 'load-data-from-api',
},
"dataSource": 'load-data-from-api',
},
"context": { // Context holds information identifying the deployment, version, application and page that generated the event
"version": "8.5.0-SNAPSHOT",
"cluster_name": "job-ftr_configs_2-cluster-ftr",
Expand All @@ -127,52 +130,54 @@ This event will be indexed with the following structure:
```

The performance metrics API supports **5 numbered free fields** that can be used to report numeric metrics that you intend to analyze.
They can hold any type of numeric information you want to report, which lets you create your own flexible schema
without having to add custom mappings.

If you want to provide event-specific context, you can add properties to the `meta` field.
The `meta` object is stored as a [flattened field](https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html), so
it is searchable and can be used to further break down event metrics.

**Note**: Keep in mind that `free field` values are integers, so floating point values will be rounded.
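
To make the free-field convention concrete, here is a minimal sketch of a helper that maps a list of named durations onto `key1`/`value1` through `key5`/`value5`. The `toFreeFields` name is hypothetical and not part of `@kbn/ebt-tools`. It rounds explicitly with `Math.round`, which is an assumption: the text above only says that floating point values will be rounded, not how.

```typescript
type FreeFields = Record<string, string | number>;

// Maps up to five [name, value] pairs onto the numbered free fields.
// Values are rounded up front so the reported number matches what ends up indexed.
function toFreeFields(breakdown: Array<[string, number]>): FreeFields {
  if (breakdown.length > 5) {
    throw new Error('Only 5 numbered free fields are available');
  }
  const fields: FreeFields = {};
  breakdown.forEach(([name, value], index) => {
    fields[`key${index + 1}`] = name;
    fields[`value${index + 1}`] = Math.round(value);
  });
  return fields;
}

const freeFields = toFreeFields([
  ['INSPECT_DATA', 180.4],
  ['PROCESS_DATA', 519.6],
]);
// The result can be spread into the event payload next to eventName and duration.
```

Rounding before reporting avoids surprises when comparing values logged in the browser with the values stored in the index.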

### How to choose and measure events

Events should be meaningful and can have multiple sub-metrics that give specific information about certain actions. For example,
a page-load event can be composed of render time, data load time during the page load, and so on. These events
should be meaningful for performance investigations and usable in visualizations and aggregations. Considering this,
creating an event for `cpuUsage` does not bring any value: it carries no context by itself, and reporting it
from different places in the code introduces a lot of variability into the performance analysis. However, it can be a useful attribute
to track inside a specific event, e.g. `page-load`.

- **Make sure that the event is clearly defined and consistent** (i.e. the same code flow is executed each time).
  Consider the start point and endpoint of the measurement and what happens between those points.
  For example: an `app-data-load` event should not include the time it takes to render the data.
- **Choose event names wisely**.
  Try to balance event name specificity. Calling an event `load` is too generic; calling an event `tsvb-data-load` is too specific (instead, the visualization
  type can be specified in a `meta` field).
- **Distinguish between flows with event context**.
  If a function that loads data is called when an app loads, when the user changes filters, and when the refresh button is clicked, you should distinguish between
  these flows by specifying a `meta` field.
- **Avoid duplicate events**.
  Make sure that measurement and reporting happen at a point in the code that is executed only once.
  For example, make sure that refresh events are reported only once per button click.
- **Measure as close to the event as possible**.
  For example, if you're measuring the execution of a specific React Effect, place the measurement code inside the effect.
  When measuring a navigation, try to start the measurement right before the navigation is performed and stop measuring as soon as all resources are loaded.
- **Use the `window.performance` API**.
  The [`performance.now()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/now) API is an accurate way to obtain timestamps.
  The [`performance.mark()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/mark) API can be used to track performance without having to pollute the
  code.
- **Keep performance in mind**. Reporting the performance of Kibana should never harm its own performance.
  Avoid sending events too frequently (`onMouseMove`) or adding serialized JSON objects (whole `SavedObjects`) into the meta object.

### Analyzing journey results

The telemetry data will be reported to the Telemetry Staging cluster along with the execution context.
Use the `context.labels.ciBuildName` label to filter down events to only those originating from performance runs and visualize the duration of events (or their breakdowns):

- Be sure to narrow your analysis down to performance events by specifying the filter `context.labels.ciBuildName: kibana-single-user-performance`.
  Otherwise you might be looking at results originating from different hardware.
- You can look at the results of a specific journey by filtering on `context.labels.journeyName`.
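
The filters above could be expressed as an Elasticsearch search body along the following lines. This is only a sketch: it assumes the label and `eventName` fields can be matched with `term` filters and that `duration` is numeric, which the text above does not confirm. The `buildJourneyQuery` helper and the journey name used in the example are hypothetical.

```typescript
// Builds a search body that keeps only events from performance runs for one
// journey and one event name, then summarizes the reported durations.
function buildJourneyQuery(journeyName: string, eventName: string) {
  return {
    size: 0, // Only the aggregation is of interest, not the individual documents
    query: {
      bool: {
        filter: [
          { term: { 'context.labels.ciBuildName': 'kibana-single-user-performance' } },
          { term: { 'context.labels.journeyName': journeyName } },
          { term: { eventName } },
        ],
      },
    },
    aggs: {
      duration_percentiles: {
        percentiles: { field: 'duration', percents: [50, 95] },
      },
    },
  };
}

// 'my_journey' is a made-up journey name used purely for illustration.
const journeyQuery = buildJourneyQuery('my_journey', 'kibana:plugin_render_time');
```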

Please contact the #kibana-performance team if you need more help visualizing and tracking the results.

Expand All @@ -181,7 +186,132 @@ Please contact the #kibana-performance team if you need more help visualizing an
All users who are opted in to report telemetry will start reporting event based telemetry as well.
The data is available to be analyzed on the production telemetry cluster.

# Report `kibana:plugin_render_time` metric event.

The metric `kibana:plugin_render_time` measures the time from the start of navigation to the point at which the most meaningful component appears on the screen.
Member

Question: are we sure it's the start of the navigation, and not the Kibana plugin start time?

Member

What we want is from the browser navigation start time. I am not sure if we are capturing that; if not, we need to change the implementation.

Contributor Author

The performance mark start::pageChange is set on route change and the kibana:plugin_render_time is the duration between the start::pageChange and the end::pageReady.

Screenshot 2024-05-27 at 12 36 05

However, I've noticed that there is more than one marker for start::pageChange per route, and performance.measure picks the latest start::pageChange marker. It seems to be due to re-renders.

I believe I need to fix this.

Member

There is another way to do this, where you don't need to set marks on mount start/end. To truly measure based on the navigation time, we could do performance.measure("render-time"); this would give the metric between the navigation time and when the page is ready.

Thanks for checking, it seems as you mentioned we need a followup to fix this.

Contributor Author

we could do performance.measure("render-time")

Where should we do the performance.measure("render-time")? I think I missed this part.

Contributor Author

fyi the ticket: https://github.com/elastic/kibana/issues/184390

Feel free to update it


This metric helps assess the performance of web applications by indicating how quickly important content is rendered to the users. The definition of the "most meaningful component" varies across different pages, requiring custom instrumentation for each page to accurately measure it.

### How it works

The `PerformanceContextProvider` utilizes the [browser's Performance API](https://developer.mozilla.org/en-US/docs/Web/API/Performance) to track and analyze the performance of page transitions within the app.

1. Upon each page change, a performance marker named `start::pageChange` is set.
2. Once the `onPageReady` function is called and the browser has finished rendering, another marker named `end::pageReady` is set.
3. The duration between these two markers is measured in **milliseconds** and reported as `eventName: kibana:plugin_render_time`.
4. The data is reported using the `reportPerformanceMetricEvent` API.
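
The mark-and-measure steps above can be sketched with the plain Performance API as follows. This is not the actual `PerformanceContextProvider` implementation: the `markPageChange` and `markPageReady` helpers are hypothetical, and clearing earlier marks on each page change is an assumption made here to keep the example deterministic.

```typescript
const START_MARK = 'start::pageChange';
const END_MARK = 'end::pageReady';
const RENDER_TIME_EVENT = 'kibana:plugin_render_time';

// Step 1: set the start marker on a page change.
function markPageChange(): void {
  // Assumption: drop marks from earlier page changes so only one pair exists at a time.
  performance.clearMarks(START_MARK);
  performance.clearMarks(END_MARK);
  performance.mark(START_MARK);
}

// Steps 2 and 3: set the end marker and measure the time between the two markers.
function markPageReady(): number {
  performance.mark(END_MARK);
  const measure = performance.measure(RENDER_TIME_EVENT, START_MARK, END_MARK);
  return measure.duration; // Milliseconds
}

markPageChange();
// ... the page fetches and renders its most meaningful content here ...
const renderTime = markPageReady();
// Step 4 would pass { eventName: RENDER_TIME_EVENT, duration: renderTime }
// to reportPerformanceMetricEvent.
```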

### Instrument your code to report `kibana:plugin_render_time` metric event

To instrument `kibana:plugin_render_time`, you need to use the `PerformanceContextProvider` at the root of your application after the `Router` and run the `onPageReady` function once the data for the most meaningful component is fetched. The meaningful data can be one or more elements or pieces of information that define the core content of the page.

#### Code Example

app.js

```
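import React from 'react';
import ReactDOM from 'react-dom';
// Assumption: any router component can be used here; BrowserRouter is shown for illustration.
import { BrowserRouter as Router } from 'react-router-dom';
import { PerformanceContextProvider } from '@kbn/ebt-tools';
import MyApp from './MyApp';

ReactDOM.render(
  <Router>
    <PerformanceContextProvider>
      <MyApp />
    </PerformanceContextProvider>
  </Router>,
  document.getElementById('root')
);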
```

```
import React, { useEffect, useState } from 'react';
import { usePerformanceContext } from '@kbn/ebt-tools';

const MyApp = () => {
  const { onPageReady } = usePerformanceContext();
  const [data, setData] = useState(null);

  useEffect(() => {
    async function loadData() {
      // fetchData is a placeholder for your own data-loading function
      const fetchedData = await fetchData();
      if (fetchedData.status === 'success') {
        setData(fetchedData);

        // Call onPageReady once the meaningful data has been fetched
        onPageReady();
      }
    }

    loadData();
  }, [onPageReady]);

  if (!data) {
    return <div>Loading...</div>;
  }

  return (
    <div>
      <h1>{data.title}</h1>
      <p>{data.content}</p>
    </div>
  );
};

export default MyApp;
```

This event will be indexed with the following structure:

```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser \ server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All performance events share a common event type to simplify mapping
    "eventName": "kibana:plugin_render_time", // Event name as specified when reporting it
    "duration": 736, // Event duration as specified when reporting it
    "meta": {
      "target": "/home",
    },
    "context": { // Context holds information identifying the deployment, version, application and page that generated the event
      "version": "8.5.0-SNAPSHOT",
      "cluster_name": "elasticsearch",
      "pageName": "application:home:app",
      "applicationId": "home",
      "page": "app",
      "entityId": "61c58ad0-3dd3-11e8-b2b9-5d5dc1715159",
      "branch": "main",
      ...
    },
    ...
  },
}
```

### Development environment

The metric will be delivered to the [Telemetry Staging](https://telemetry-v2-staging.elastic.dev/) cluster, along with the event's context.
The data is updated periodically, so you might have to wait up to 30 minutes to see your data in the index.

Once indexed, this metric will appear in the `ebt-kibana` index. It is also mapped into an additional index dedicated to performance metrics, `ebt-kibana-browser-performance*`.

[Dashboard](<https://telemetry-v2-staging.elastic.dev/s/apm/app/dashboards#/view/f240fff6-fac9-491b-81d1-ac39006c5c94?_g=(filters:!(),refreshInterval:(pause:!t,value:60000),time:(from:now-15h,to:now))>)

### Production environment

All users who are opted in to report telemetry will start reporting event based telemetry as well.
The data is available to be analyzed on the production telemetry cluster.

[Dashboard](<https://stack-telemetry.elastic.dev/s/apm/app/dashboards#/view/f240fff6-fac9-491b-81d1-ac39006c5c94?_g=(filters:!(),refreshInterval:(pause:!t,value:60000),time:(from:now-15h,to:now))>)

# Analytics Client

Holds the public APIs to report events, enrich the events' context, and set up the transport mechanisms. Please check out the package documentation for more information about the
[Analytics Client](https://github.com/elastic/kibana/blob/main/packages/analytics/README.md).