WIP Exclude managed kubernetes #42

Open
wants to merge 23 commits into
base: master
cef8fe0
Builds kubernetes-mixin from a fork, instead of custom jsonnet
solsson Jan 20, 2022
bba03fb
The effect of skipping our custom mixin.libsonnet
solsson Jan 14, 2022
1512d98
With managed components commented out from upstream libsonnet files
solsson Jan 14, 2022
07eddf3
Dashboards with managed components commented out
solsson Jan 14, 2022
b184898
Hints about the "applyable" i.e. smaller bundle
solsson Jan 20, 2022
9190a00
We need these overrides for GKE, or at least some of them
solsson Jan 20, 2022
6195854
With overrides restored
solsson Jan 20, 2022
29c5d82
curl ksm:8081/metrics | grep -v '^#' | cut -d'{' -f1 | sort | uniq
solsson Jan 20, 2022
accae48
With sort | uniq -c
solsson Jan 20, 2022
33ce30f
A primitive way to see which metrics are used
solsson Jan 20, 2022
ad5407d
With this whitelist the scrape body is 26k lines instead of 122k
solsson Jan 20, 2022
b143336
Merge branch 'kube-state-metrics-whitelist' into exclude-managed-kube…
solsson Jan 20, 2022
cd760cd
Current Grafana
solsson Jan 21, 2022
88747d4
Removes accidentally deleted kustomization.yaml
solsson Jan 21, 2022
a08f245
I can't find this dashboard upstream now and we've never used it
solsson Jan 21, 2022
a9ba4b9
Adds alerts for prometheus-operator, such as reconcile errors
solsson Jan 21, 2022
6c50e23
A way to verify that we haven't removed essential alerts
solsson Jan 24, 2022
a11bd98
Merge remote-tracking branch 'origin/master' into exclude-managed-kub…
solsson Jul 1, 2022
2206485
Current grafana
solsson Jul 1, 2022
514cd43
kustomize openapi fetch a cluster with latest prometheus-operator
solsson Jul 5, 2022
f1dc062
Adds merge keys for Kustomize patchesStrategicMerge
solsson Jul 5, 2022
8cd39ee
Merge pull request #49 from Yolean/prometheusrule-strategic-merge
solsson Jul 5, 2022
14f918c
we no longer have this prefix for memory metrics, only for cpu
solsson Dec 2, 2024
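
Commits 29c5d82 and accae48 describe listing kube-state-metrics metric names with `curl ksm:8081/metrics | grep -v '^#' | cut -d'{' -f1 | sort | uniq -c`. The following is a minimal sketch of that pipeline, run against an inline sample instead of a live endpoint. The function name `count_metric_names`, the sample text, and the extra `cut -d' ' -f1` (so that metrics without labels are also reduced to their name) are assumptions, not part of this PR.

```shell
# count_metric_names: read Prometheus text exposition format on stdin and
# print "<number of samples> <metric name>" per metric, sorted by name.
count_metric_names() {
  grep -v '^#' \
    | cut -d'{' -f1 \
    | cut -d' ' -f1 \
    | sort \
    | uniq -c \
    | awk '{ print $1, $2 }'   # strip the padding that uniq -c adds
}

# Hypothetical sample, standing in for the body of `curl ksm:8081/metrics`.
sample='# HELP kube_pod_info Information about pod.
# TYPE kube_pod_info gauge
kube_pod_info{namespace="a",pod="x"} 1
kube_pod_info{namespace="a",pod="y"} 1
kube_node_info{node="n1"} 1'

printf '%s\n' "$sample" | count_metric_names
# prints:
# 1 kube_node_info
# 2 kube_pod_info
```

The output has the same `<count> <name>` shape as `kube-state-metrics/whitelist/sample_metric_names.txt` in this PR.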
8 changes: 4 additions & 4 deletions README.md
@@ -72,8 +72,9 @@ How to avoid boilerplate?

## Apply the example monitoring stack

-Assuming that `github.com/coreos/prometheus-operator/?ref=[a recent revision]` is already installed,
-start from the example kustomize base:
+Assuming that `github.com/coreos/prometheus-operator/?ref=[a recent revision]`
+or `github.com/solsson/prometheus-operator/example/?ref=[a recent revision]`
+is already installed, start from the example kustomize base:

```
kubectl apply -k example-small
@@ -97,8 +98,7 @@ A real stack might start from example-small and then:
This repo needs to have some generated content, where upstream kustomize bases could not be found

```
-docker-compose -f docker-compose.test.yml build --no-cache kubernetes-mixin
-docker-compose -f docker-compose.test.yml up --no-build kubernetes-mixin
+./kubernetes-mixin-update.sh
```

## CI test suite
7 changes: 6 additions & 1 deletion base-label-prometheus-now/kustomization.yaml
@@ -1,11 +1,16 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# The resources to be scraped are versioned without the labels that Prometheus Operator matches on
# so the purpose of this kustomize base is to collect them and set a chosen label
commonLabels:
prometheus: now
bases:

resources:
- ../k8s
- ../node-exporter
- ../kube-state-metrics
- ../kubernetes-mixin
- ../assertions_failed
- ../scrape-annotations
- github.com/solsson/prometheus-operator/example/applyable-alerts/?ref=55ce571ef95b90d46257e469aa5d1885594fd4c1
46 changes: 1 addition & 45 deletions docker-compose.test.yml
@@ -53,7 +53,7 @@ services:
- agent
image: solsson/ystack-runner:cc6234b863b09ec9272c598d5a2431f9f11b0317@sha256:ffc2b8ac771af99c9a9dc4215f72285d33bdb318be58d2bb630e4e33266e69d8
environment:
-- PROMETHEUS_OPERATOR_BASE=github.com/coreos/prometheus-operator/?ref=ec153e0a2007b1a569876e490cb036c30f5d707b
+- PROMETHEUS_OPERATOR_BASE=github.com/solsson/prometheus-operator/example/?ref=00345ad49bc79802160df765a18a32913d18300a
- KUBECONFIG_WAIT=30
- KEEP_RUNNING=false
volumes:
@@ -81,50 +81,6 @@ services:
mem_limit: 80000000
memswap_limit: 0

kubernetes-mixin:
build: ./kubernetes-mixin
environment:
- DEBUG=true
entrypoint:
- sh
- -ce
command:
- |
cat << EOF > /kustomize-base/rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: kubernetes-mixin-rules
spec:
EOF
cat prometheus_rules.yaml | gojsontoyaml | sed 's|^| |' >> /kustomize-base/rules.yaml

cat << EOF > /kustomize-base/alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: kubernetes-mixin-alerts
spec:
EOF
cat prometheus_alerts.yaml | gojsontoyaml | sed 's|^| |' >> /kustomize-base/alerts.yaml

cp -rv dashboards_out/* /dashboards-kustomize-base
rm /dashboards-kustomize-base/controller-manager.json
cat << EOF > /dashboards-kustomize-base/kustomization.yaml
# It should be possible to extend this set of dashboards using the "replace" behavior
# Meant to be used with kubectl create -k and kubectl replace -k
# (unless we learn how to get rid of last-applied-configuration with apply)
generatorOptions:
disableNameSuffixHash: true
configMapGenerator:
- name: kubernetes-mixin-grafana-dashboards
files:
EOF
find /dashboards-kustomize-base -name \*.json | sed 's|.*/\(.*\)| - \1=\1|' >> /dashboards-kustomize-base/kustomization.yaml
volumes:
- ./kubernetes-mixin:/kustomize-base
- ./kubernetes-mixin-dashboards:/dashboards-kustomize-base

volumes:
k3s-server: {}
admin: {}
8 changes: 6 additions & 2 deletions example-small/kustomization.yaml
@@ -1,14 +1,18 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: monitoring
bases:

resources:
- ../rbac-prometheus
- ../base-label-prometheus-now
# Needs separate kubectl replace -k
#- ../kubernetes-mixin-dashboards
resources:
- main-alertmanager-service.yaml
- main-alertmanager.yaml
- now-prometheus-service.yaml
- now-prometheus.yaml

generatorOptions:
disableNameSuffixHash: true
secretGenerator:
2 changes: 1 addition & 1 deletion grafana/kustomization.yaml
@@ -6,7 +6,7 @@ namespace: monitoring
images:
- name: grafana/grafana
newName: docker.io/grafana/grafana
-newTag: 8.5.1-ubuntu
+newTag: 9.0.2-ubuntu
- name: grafana/grafana-image-renderer
newName: docker.io/grafana/grafana-image-renderer
newTag: 3.4.2
1 change: 1 addition & 0 deletions kube-state-metrics/kube-state-metrics-deployment.yaml
@@ -20,6 +20,7 @@ spec:
- --port=8081
- --telemetry-host=0.0.0.0
- --telemetry-port=8082
- --metric-allowlist=kube_daemonset_status_current_number_scheduled,kube_daemonset_status_desired_number_scheduled,kube_daemonset_status_number_available,kube_daemonset_status_number_misscheduled,kube_deployment_metadata_generation,kube_deployment_spec_replicas,kube_deployment_status_observed_generation,kube_deployment_status_replicas,kube_deployment_status_replicas_available,kube_deployment_status_replicas_updated,kube_horizontalpodautoscaler_spec_max_replicas,kube_horizontalpodautoscaler_spec_min_replicas,kube_horizontalpodautoscaler_status_current_replicas,kube_horizontalpodautoscaler_status_desired_replicas,kube_job_failed,kube_job_spec_completions,kube_job_status_failed,kube_job_status_start_time,kube_job_status_succeeded,kube_namespace_status_phase,kube_node_info,kube_node_status_allocatable,kube_node_status_capacity,kube_persistentvolume_status_phase,kube_persistentvolumeclaim_access_mode,kube_persistentvolumeclaim_labels,kube_pod_container_resource_limits,kube_pod_container_resource_requests,kube_pod_container_status_last_terminated_reason,kube_pod_container_status_waiting,kube_pod_info,kube_pod_owner,kube_pod_status_phase,kube_replicaset_owner,kube_resourcequota,kube_statefulset_metadata_generation,kube_statefulset_replicas,kube_statefulset_status_current_revision,kube_statefulset_status_observed_generation,kube_statefulset_status_replicas,kube_statefulset_status_replicas_ready,kube_statefulset_status_replicas_updated,kube_statefulset_status_update_revision
image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0
name: kube-state-metrics
ports:
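
The `--metric-allowlist` value above is one long comma-separated string, which is hard to review by eye. As a rough sketch, assuming the chosen names are kept one per line in some file or variable, the flag value could be generated like this. The function name `join_allowlist` and the three sample names are illustrative only; this PR does not contain such a helper.

```shell
# join_allowlist: read one metric name per line on stdin, drop blank lines
# and duplicates, and print a single comma-separated list.
join_allowlist() {
  grep -v '^[[:space:]]*$' | sort -u | paste -sd, -
}

# Hypothetical input with a duplicate entry.
names='kube_pod_info
kube_node_info
kube_pod_info'

printf -- '--metric-allowlist=%s\n' "$(printf '%s\n' "$names" | join_allowlist)"
# prints: --metric-allowlist=kube_node_info,kube_pod_info
```

Sorting the names keeps the generated argument stable, so a later diff of the deployment only shows names that were actually added or removed.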
18 changes: 18 additions & 0 deletions kube-state-metrics/whitelist/find.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash
[ -z "$DEBUG" ] || set -x
set -e

# Usage, to list the metric names that are referenced at least once:
# ./kube-state-metrics/whitelist/find.sh | grep -v '=0' | cut -d' ' -f1 | sort | uniq

DIR="$(dirname "$0")"
BASE="$DIR/../.."

for search in \
    "$BASE"/kubernetes-mixin/*.yaml \
    "$BASE"/kubernetes-mixin-dashboards/*.json \
    ; do
  # sample_metric_names.txt has "<count> <name>" lines; the name is field 2
  for name in $(awk '{ print $2 }' "$DIR/sample_metric_names.txt"); do
    echo -n "$name $search ="
    # grep -c prints the number of matching lines and exits 1 when it is 0
    grep -c "$name" "$search" || true
  done
done
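
One thing to watch in the script above: `grep "$name"` matches substrings, so a name such as `kube_resourcequota` is also counted wherever only `kube_resourcequota_created` appears. The following bash sketch shows a stricter variant using whole-word, fixed-string matching (`grep -w -F`). The function name `used_metric_names` and the demo files are hypothetical and not part of this PR.

```shell
# used_metric_names NAMES_FILE SEARCH_FILE...
# NAMES_FILE has "<count> <name>" lines, as in sample_metric_names.txt.
# Prints each name that occurs as a whole word in at least one SEARCH_FILE.
used_metric_names() {
  local names_file="$1"; shift
  local name
  awk 'NF >= 2 { print $2 }' "$names_file" | while read -r name; do
    # -F: treat the name as a literal string; -w: "_" counts as a word
    # character, so kube_resourcequota does not match kube_resourcequota_created
    if grep -q -w -F -- "$name" "$@"; then
      echo "$name"
    fi
  done
}

# Hypothetical demo data in a temporary directory.
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' '572 kube_resourcequota' '58 kube_resourcequota_created' '24 kube_node_info' > "$tmp/names.txt"
printf '%s\n' 'expr: sum(kube_resourcequota{type="used"}) by (namespace)' > "$tmp/rules.yaml"

used_metric_names "$tmp/names.txt" "$tmp/rules.yaml"
# prints: kube_resourcequota
```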
205 changes: 205 additions & 0 deletions kube-state-metrics/whitelist/sample_metric_names.txt
@@ -0,0 +1,205 @@
2 kube_certificatesigningrequest_annotations
2 kube_certificatesigningrequest_cert_length
4 kube_certificatesigningrequest_condition
2 kube_certificatesigningrequest_created
2 kube_certificatesigningrequest_labels
341 kube_configmap_annotations
341 kube_configmap_created
341 kube_configmap_info
341 kube_configmap_labels
341 kube_configmap_metadata_resource_version
77 kube_cronjob_annotations
77 kube_cronjob_created
77 kube_cronjob_info
77 kube_cronjob_labels
77 kube_cronjob_metadata_resource_version
38 kube_cronjob_next_schedule_time
77 kube_cronjob_spec_failed_job_history_limit
77 kube_cronjob_spec_successful_job_history_limit
77 kube_cronjob_spec_suspend
77 kube_cronjob_status_active
43 kube_cronjob_status_last_schedule_time
9 kube_daemonset_annotations
9 kube_daemonset_created
9 kube_daemonset_labels
9 kube_daemonset_metadata_generation
9 kube_daemonset_status_current_number_scheduled
9 kube_daemonset_status_desired_number_scheduled
9 kube_daemonset_status_number_available
9 kube_daemonset_status_number_misscheduled
9 kube_daemonset_status_number_ready
9 kube_daemonset_status_number_unavailable
9 kube_daemonset_status_observed_generation
9 kube_daemonset_status_updated_number_scheduled
701 kube_deployment_annotations
701 kube_deployment_created
701 kube_deployment_labels
701 kube_deployment_metadata_generation
701 kube_deployment_spec_paused
701 kube_deployment_spec_replicas
676 kube_deployment_spec_strategy_rollingupdate_max_surge
676 kube_deployment_spec_strategy_rollingupdate_max_unavailable
4206 kube_deployment_status_condition
701 kube_deployment_status_observed_generation
701 kube_deployment_status_replicas
701 kube_deployment_status_replicas_available
701 kube_deployment_status_replicas_ready
701 kube_deployment_status_replicas_unavailable
701 kube_deployment_status_replicas_updated
930 kube_endpoint_address_available
930 kube_endpoint_address_not_ready
930 kube_endpoint_annotations
930 kube_endpoint_created
930 kube_endpoint_info
930 kube_endpoint_labels
650 kube_endpoint_ports
7 kube_horizontalpodautoscaler_annotations
7 kube_horizontalpodautoscaler_info
7 kube_horizontalpodautoscaler_labels
7 kube_horizontalpodautoscaler_metadata_generation
7 kube_horizontalpodautoscaler_spec_max_replicas
7 kube_horizontalpodautoscaler_spec_min_replicas
7 kube_horizontalpodautoscaler_spec_target_metric
63 kube_horizontalpodautoscaler_status_condition
7 kube_horizontalpodautoscaler_status_current_replicas
7 kube_horizontalpodautoscaler_status_desired_replicas
259 kube_job_annotations
708 kube_job_complete
259 kube_job_created
60 kube_job_failed
259 kube_job_info
259 kube_job_labels
259 kube_job_owner
3 kube_job_spec_active_deadline_seconds
259 kube_job_spec_completions
259 kube_job_spec_parallelism
259 kube_job_status_active
236 kube_job_status_completion_time
296 kube_job_status_failed
259 kube_job_status_start_time
259 kube_job_status_succeeded
29 kube_lease_owner
29 kube_lease_renew_time
3 kube_limitrange
2 kube_limitrange_created
7 kube_mutatingwebhookconfiguration_created
7 kube_mutatingwebhookconfiguration_info
7 kube_mutatingwebhookconfiguration_metadata_resource_version
57 kube_namespace_annotations
57 kube_namespace_created
57 kube_namespace_labels
114 kube_namespace_status_phase
24 kube_node_annotations
24 kube_node_created
24 kube_node_info
24 kube_node_labels
1 kube_node_spec_taint
24 kube_node_spec_unschedulable
168 kube_node_status_allocatable
168 kube_node_status_capacity
843 kube_node_status_condition
64 kube_persistentvolume_annotations
64 kube_persistentvolume_capacity_bytes
64 kube_persistentvolume_claim_ref
64 kube_persistentvolume_info
64 kube_persistentvolume_labels
320 kube_persistentvolume_status_phase
68 kube_persistentvolumeclaim_access_mode
68 kube_persistentvolumeclaim_annotations
68 kube_persistentvolumeclaim_info
68 kube_persistentvolumeclaim_labels
68 kube_persistentvolumeclaim_resource_requests_storage_bytes
204 kube_persistentvolumeclaim_status_phase
1154 kube_pod_annotations
95 kube_pod_completion_time
1726 kube_pod_container_info
1848 kube_pod_container_resource_limits
2062 kube_pod_container_resource_requests
1726 kube_pod_container_state_started
73 kube_pod_container_status_last_terminated_reason
1726 kube_pod_container_status_ready
1726 kube_pod_container_status_restarts_total
1726 kube_pod_container_status_running
1726 kube_pod_container_status_terminated
107 kube_pod_container_status_terminated_reason
1726 kube_pod_container_status_waiting
1154 kube_pod_created
1154 kube_pod_info
207 kube_pod_init_container_info
10 kube_pod_init_container_resource_limits
10 kube_pod_init_container_resource_requests
207 kube_pod_init_container_status_ready
207 kube_pod_init_container_status_restarts_total
207 kube_pod_init_container_status_running
207 kube_pod_init_container_status_terminated
207 kube_pod_init_container_status_terminated_reason
207 kube_pod_init_container_status_waiting
1154 kube_pod_labels
1154 kube_pod_owner
1154 kube_pod_restart_policy
41 kube_pod_spec_volumes_persistentvolumeclaims_info
41 kube_pod_spec_volumes_persistentvolumeclaims_readonly
1154 kube_pod_start_time
5770 kube_pod_status_phase
3255 kube_pod_status_ready
5770 kube_pod_status_reason
3255 kube_pod_status_scheduled
1085 kube_pod_status_scheduled_time
6 kube_poddisruptionbudget_annotations
6 kube_poddisruptionbudget_created
6 kube_poddisruptionbudget_labels
6 kube_poddisruptionbudget_status_current_healthy
6 kube_poddisruptionbudget_status_desired_healthy
6 kube_poddisruptionbudget_status_expected_pods
6 kube_poddisruptionbudget_status_observed_generation
6 kube_poddisruptionbudget_status_pod_disruptions_allowed
3704 kube_replicaset_annotations
3704 kube_replicaset_created
3704 kube_replicaset_labels
3704 kube_replicaset_metadata_generation
3704 kube_replicaset_owner
3704 kube_replicaset_spec_replicas
3704 kube_replicaset_status_fully_labeled_replicas
3704 kube_replicaset_status_observed_generation
3704 kube_replicaset_status_ready_replicas
3704 kube_replicaset_status_replicas
572 kube_resourcequota
58 kube_resourcequota_created
676 kube_secret_annotations
676 kube_secret_created
676 kube_secret_info
676 kube_secret_labels
676 kube_secret_metadata_resource_version
676 kube_secret_type
962 kube_service_annotations
962 kube_service_created
962 kube_service_info
962 kube_service_labels
962 kube_service_spec_type
1 kube_service_status_load_balancer_ingress
28 kube_statefulset_annotations
28 kube_statefulset_created
28 kube_statefulset_labels
28 kube_statefulset_metadata_generation
28 kube_statefulset_replicas
28 kube_statefulset_status_current_revision
28 kube_statefulset_status_observed_generation
28 kube_statefulset_status_replicas
28 kube_statefulset_status_replicas_available
28 kube_statefulset_status_replicas_current
28 kube_statefulset_status_replicas_ready
28 kube_statefulset_status_replicas_updated
28 kube_statefulset_status_update_revision
13 kube_storageclass_annotations
13 kube_storageclass_created
13 kube_storageclass_info
13 kube_storageclass_labels
10 kube_validatingwebhookconfiguration_created
10 kube_validatingwebhookconfiguration_info
10 kube_validatingwebhookconfiguration_metadata_resource_version
4 kube_volumeattachment_created
4 kube_volumeattachment_info
4 kube_volumeattachment_labels
4 kube_volumeattachment_spec_source_persistentvolume
4 kube_volumeattachment_status_attached
4 kube_volumeattachment_status_attachment_metadata
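
The first column of this file is the number of samples per metric name, so it can give a rough estimate of how much of a scrape a given allowlist keeps. The sketch below sums the counts for allowlisted names and for all names. The function name `kept_series` is hypothetical, and the demo uses only three lines copied from the list above; it makes no claim about the totals for the full file or about the 26k versus 122k line counts mentioned in commit ad5407d.

```shell
# kept_series ALLOWLIST NAMES_FILE
# ALLOWLIST is a comma-separated list of metric names, NAMES_FILE has
# "<count> <name>" lines. Prints "<kept> <total>" sample counts.
kept_series() {
  awk -v allow="$1" '
    BEGIN { n = split(allow, a, ","); for (i = 1; i <= n; i++) keep[a[i]] = 1 }
    NF >= 2 { total += $1; if ($2 in keep) kept += $1 }
    END { print kept + 0, total + 0 }
  ' "$2"
}

# Three-line excerpt of sample_metric_names.txt as demo input.
excerpt="$(mktemp)"
printf '%s\n' '1154 kube_pod_info' '1154 kube_pod_labels' '24 kube_node_info' > "$excerpt"

kept_series 'kube_pod_info,kube_node_info' "$excerpt"
# prints: 1178 2332
rm -f "$excerpt"
```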