Monitoring docs #209

Merged
merged 8 commits into from
Apr 5, 2022
Binary file added docs/assets/images/grafana_dashboard.png
2 changes: 1 addition & 1 deletion docs/howtos/clusterautoscaling.md
@@ -2,7 +2,7 @@
layout: default
title: Cluster Autoscaling
parent: How to's
-nav_order: 5
+nav_order: 6
---

# Cluster Autoscaling
89 changes: 89 additions & 0 deletions docs/howtos/monitoring.md
@@ -0,0 +1,89 @@
---
layout: default
title: Monitoring
parent: How to's
nav_order: 5
---

# Monitoring

Thundernetes is able to export game server related metrics to a [Prometheus](https://prometheus.io/docs/introduction/overview/) server, and these can also be imported
to a [Grafana](https://grafana.com/docs/grafana/latest/introduction/) server for easy and intuitive visualizations.

[![Grafana Dashboard Example](../assets/images/grafana_dashboard.png)](../assets/images/grafana_dashboard.png)

Prometheus uses a pull model to retrieve data, and needs apps to implement an endpoint that responds to its HTTP requests. For this, Thundernetes exposes the following endpoints:

* **{controller manager service IP}:8080/metrics**
* **{nodeagent service IP}:56001/metrics**
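
As a quick check that an endpoint responds, you can fetch it with `curl` and keep only the samples for one metric. The sketch below is an illustration and not part of Thundernetes: `filter_metric` is a hypothetical helper, and `METRICS_IP` is a placeholder for whichever service IP you are using.

```bash
# Hypothetical helper: read Prometheus text-format output on stdin and keep
# only the sample lines for the metric named in $1. Comment lines (# HELP,
# # TYPE) and metrics that merely share a prefix are dropped.
filter_metric() {
  grep -E "^$1(\{| )"
}

# Usage against a live cluster (METRICS_IP is a placeholder you must set):
#   curl -s "http://${METRICS_IP}:8080/metrics" | filter_metric allocations_total
```

If the command prints nothing, the metric may simply not have been recorded yet, for example when no allocation has happened.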

## Install Thundernetes with Prometheus and Grafana

While you can create and manage your own Prometheus and Grafana instances to consume the endpoints described above, you can also install both into your K8s cluster in a few steps thanks to the [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) project. This installs the prometheus-operator and Grafana and connects them automatically. Follow these steps:

```bash
# clone the kube-prometheus repository
git clone https://github.com/prometheus-operator/kube-prometheus.git

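# git clone creates a directory named after the repository
cd kube-prometheus/

# install kube-prometheus' CRDs
kubectl create -f manifests/setup
# wait until the ServiceMonitor CRD is available
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
# install the remaining kube-prometheus manifests
kubectl create -f manifests/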
```

After this, you can install Thundernetes using the "with monitoring" install files. These automatically create Prometheus ServiceMonitors that scrape the endpoints described above.

```bash
kubectl apply -f https://raw.githubusercontent.com/PlayFab/thundernetes/main/installfiles/operator_with_monitoring.yaml
```

## Check the data in Prometheus and Grafana

To test this, you can install the netcore Game Server Build sample, a basic application that uses GSDK to send information to Thundernetes. You should also allocate a server so you can see the data.

```bash
kubectl apply -f https://raw.githubusercontent.com/PlayFab/thundernetes/main/samples/netcore/sample-requestslimits.yaml
```
This will create a Game Server Build with 2 standby Game Servers. You can check that they were successfully created like this:

```bash
# check the build
kubectl get gsb

# check the servers
kubectl get gs
```
To allocate a server, you need access to the thundernetes-controller-manager IP. In an AKS cluster, you can get it like this:
```bash
export IP=$(kubectl get svc -n thundernetes-system thundernetes-controller-manager -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```
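
The `jsonpath` expression prints an empty string while the load balancer IP is still being assigned, so it can help to check the variable before using it. A minimal sketch, assuming a hypothetical helper named `check_ip` that is not part of Thundernetes:

```bash
# Hypothetical guard: fail early when the IP passed in $1 is empty, which
# happens while the LoadBalancer service has no external IP assigned yet.
check_ip() {
  if [ -z "$1" ]; then
    echo "no external IP yet; re-run the kubectl command in a few seconds" >&2
    return 1
  fi
  echo "using controller manager IP $1"
}

# Usage, after exporting IP as shown above:
#   check_ip "$IP" || exit 1
```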

Then allocate the server. The `buildID` must be the same as the one in the YAML file of the netcore sample, and the `sessionID` is used to identify the session:
```bash
curl -H 'Content-Type: application/json' -d '{"buildID":"85ffe8da-c82f-4035-86c5-9d2b5f42d6f6","sessionID":"ac1b7082-d811-47a7-89ae-fe1a9c48a6da"}' http://${IP}:5000/api/v1/allocate
```
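
If you allocate more than once, it may be convenient to build the request body from variables instead of editing the JSON by hand. The sketch below assumes a hypothetical helper named `allocation_payload`; only the `buildID` and `sessionID` field names come from the request above.

```bash
# Hypothetical helper: print the JSON body for the allocate call.
#   $1 = build ID (must match the one in the sample YAML)
#   $2 = session ID (any identifier you choose for the session)
allocation_payload() {
  printf '{"buildID":"%s","sessionID":"%s"}\n' "$1" "$2"
}

# Usage, with IP exported as shown earlier and your own SESSION_ID value:
#   curl -H 'Content-Type: application/json' \
#     -d "$(allocation_payload 85ffe8da-c82f-4035-86c5-9d2b5f42d6f6 "$SESSION_ID")" \
#     "http://${IP}:5000/api/v1/allocate"
```

The helper does no escaping, so it is only suitable for plain identifiers such as UUIDs.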

You can check the exported data in Prometheus. To access the Prometheus instance in your cluster, use port forwarding and open localhost:9090 in your browser:
```bash
kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
```
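
With the port-forward running, the same data can also be read from the command line through the Prometheus HTTP API's instant-query endpoint, `/api/v1/query`. A minimal sketch, assuming a hypothetical helper named `prom_query_url` and the default forwarded address:

```bash
# Hypothetical helper: build an instant-query URL for the Prometheus HTTP API.
#   $1 = metric name (plain names only; this sketch does no URL encoding)
#   $2 = Prometheus base URL, defaulting to the forwarded port above
prom_query_url() {
  printf '%s/api/v1/query?query=%s\n' "${2:-http://localhost:9090}" "$1"
}

# Usage, in a second terminal while the port-forward is active:
#   curl -s "$(prom_query_url allocations_total)"
```

For expressions that contain spaces, brackets, or quotes, `curl -G --data-urlencode 'query=...'` is a safer choice than building the URL by hand.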

You can check the same data in Grafana and create a custom dashboard. To access the Grafana instance in your cluster, use port forwarding and open localhost:3000 in your browser:
```bash
kubectl --namespace monitoring port-forward svc/grafana 3000
```
There is a custom Grafana dashboard example that visualizes some of this data in the [samples/grafana](https://github.com/PlayFab/thundernetes/tree/main/samples/grafana) directory.

## List of exported Prometheus metrics
| Metric name | Metric type | Source |
| --- | --- | --- |
| gameserver_states | Gauge | nodeagent |
| connected_players | Gauge | nodeagent |
| gameservers_current_state_per_build | Gauge | controller-manager |
| gameservers_created_total | Counter | controller-manager |
| gameservers_sessionended_total | Counter | controller-manager |
| gameservers_crashed_total | Counter | controller-manager |
| gameservers_deleted_total | Counter | controller-manager |
| allocations_total | Counter | controller-manager |
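
A metric in this table can appear once per label combination in the raw `/metrics` output. As a rough illustration, the sketch below adds up all samples of one metric. `sum_metric` is a hypothetical helper, `METRICS_IP` is a placeholder, and the sketch assumes that sample lines carry no trailing timestamp. Whether a sum is meaningful depends on the metric.

```bash
# Hypothetical helper: read Prometheus text-format output on stdin and print
# the sum of all sample values for the metric named in $1.
# Assumes the value is the last field on each sample line (no timestamp).
sum_metric() {
  awk -v m="$1" '
    $0 !~ /^#/ && ($1 == m || index($1, m "{") == 1) { total += $NF }
    END { print total + 0 }
  '
}

# Usage against a nodeagent endpoint (METRICS_IP is a placeholder):
#   curl -s "http://${METRICS_IP}:56001/metrics" | sum_metric connected_players
```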
2 changes: 1 addition & 1 deletion docs/howtos/upgradebuild.md
@@ -2,7 +2,7 @@
layout: default
title: Upgrading your game server
parent: How to's
-nav_order: 8
+nav_order: 9
---

# Upgrading your game server
Binary file removed samples/grafana/ExampleDashboard.jpg
Binary file not shown.
2 changes: 1 addition & 1 deletion samples/grafana/readme.md
@@ -1,5 +1,5 @@
# Sample Grafana Dashboard

-![example dashboard image](./ExampleDashboard.jpg)
+[![Grafana Dashboard Example](../../docs/assets/images/grafana_dashboard.png)](../../docs/assets/images/grafana_dashboard.png)

The `dashboard.json` file in this folder can be imported into Grafana to provide a simple example of retrieving metrics from the Thundernetes Controller. It assumes that the Prometheus data source is named `Prometheus`, so you may need to edit the JSON file before importing if your Grafana installation uses a different name.