This guide describes how to install a ServiceMonitor and point it at your app. The repo contains a basic Go app and a basic ServiceMonitor implementation.
To enable user-defined workload monitoring, you need a config map that contains the value below. An example is located at /extern_resource/...
enableUserWorkload: true
Deploy that config map, or edit the existing one if it is already there:
oc edit configmap cluster-monitoring-config -n openshift-monitoring
Make sure the user workload monitoring pods are running:
oc get pod -n openshift-user-workload-monitoring
I included a sample Go app that exposes a metric driven by a timer. It just toggles a value between 0 and 1, but it gives you a starting point. The deployment files are available in kustomization/... There is also a Dockerfile if you want to build the image yourself.
For now, see kustomization/servicemonitor.yaml for what the file should look like.
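As a rough illustration of the timer-driven toggle described above, here is a minimal sketch that uses only the Go standard library. It is not the app shipped in the repo, which may well use a Prometheus client library instead. The metric name `example_toggle`, the 10 second interval, and the helper names `flip` and `renderMetrics` are assumptions made for this example. Port 8080 is chosen to match the Service shown later in this document.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync/atomic"
	"time"
)

// toggle holds the current gauge value (0 or 1).
var toggle int64

// flip returns the other value of a 0/1 gauge.
func flip(v int64) int64 { return 1 - v }

// renderMetrics formats one gauge in the Prometheus text exposition format.
// The metric name example_toggle is hypothetical.
func renderMetrics(v int64) string {
	return fmt.Sprintf("# HELP example_toggle Toggles between 0 and 1 on a timer.\n"+
		"# TYPE example_toggle gauge\n"+
		"example_toggle %d\n", v)
}

func main() {
	// Flip the gauge on a fixed interval (the interval is an assumption).
	go func() {
		for range time.Tick(10 * time.Second) {
			atomic.StoreInt64(&toggle, flip(atomic.LoadInt64(&toggle)))
		}
	}()

	// Serve the current value at /metrics on port 8080.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, renderMetrics(atomic.LoadInt64(&toggle)))
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With this running locally, `curl localhost:8080/metrics` should return the HELP and TYPE lines followed by a single sample.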
oc apply -f servicemonitor.yaml
A quick review of ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-monitor
  namespace: servicemonitor-a
spec:
  endpoints:
  - interval: 30s
    port: web  # <- points to svc port name
    scheme: http
  selector:
    matchLabels:
      app: prometheus-example-app  # <- points to svc label
Maps to Service
apiVersion: v1
kind: Service
metadata:
  name: prometheus-example-app
  namespace: servicemonitor-a
  labels:
    app: prometheus-example-app  # <- this label
spec:
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP
    name: web  # <- this port name
  selector:
    app: prometheus-example-app
  type: ClusterIP
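The mapping between the two manifests comes down to two checks: every `matchLabels` entry must appear on the Service's own labels, and the endpoint `port` must equal a Service port name. The sketch below only illustrates that logic. It is not the Prometheus Operator's code, and the helper names `selectorMatches` and `hasPortNamed` are made up for this example.

```go
package main

import "fmt"

// selectorMatches reports whether every matchLabels entry is present on the
// Service with the same value. Extra labels on the Service are allowed.
func selectorMatches(matchLabels, serviceLabels map[string]string) bool {
	for k, want := range matchLabels {
		if got, ok := serviceLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// hasPortNamed reports whether the Service exposes a port with the given name.
func hasPortNamed(portNames []string, name string) bool {
	for _, p := range portNames {
		if p == name {
			return true
		}
	}
	return false
}

func main() {
	// Values copied from the ServiceMonitor and Service shown above.
	matchLabels := map[string]string{"app": "prometheus-example-app"}
	serviceLabels := map[string]string{"app": "prometheus-example-app"}

	fmt.Println(selectorMatches(matchLabels, serviceLabels)) // true
	fmt.Println(hasPortNamed([]string{"web"}, "web"))        // true
}
```

Note that the labels being compared are the ones under the Service's `metadata.labels`, not the pod selector under `spec.selector`.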
Here are some things I've seen cause problems with the initial setup of a ServiceMonitor.
As mentioned above, check that the ServiceMonitor points to the Service and not to the deployment or pod.
oc get service prometheus-example-app -n servicemonitor-a --show-labels
oc get service prometheus-example-app -n servicemonitor-a -o yaml
In OCP, you should see a Target under Admin -> Observe -> Targets.
You can also list the targets in Prometheus by querying api/v1/targets. You should see 0 in droppedTargetCounts, but more importantly you should see your service among the targets being scraped. If it is counted as dropped, the ServiceMonitor probably could not be matched to the service.
PROM_POD=$(oc get pods -n openshift-user-workload-monitoring -l app.kubernetes.io/name=prometheus -o name | head -1)
oc port-forward -n openshift-user-workload-monitoring $PROM_POD 9090:9090
curl localhost:9090/api/v1/targets
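The JSON returned by the targets endpoint can be long, so a small summarizer may help. The sketch below assumes the response carries `activeTargets` (with `scrapePool`, `scrapeUrl`, `health`, `lastError`) and `droppedTargetCounts`, as described in the Prometheus HTTP API documentation. Older Prometheus versions may not return `droppedTargetCounts`. The function name `summarize` is hypothetical, and `main` assumes the port-forward above is still running.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

// targetsResponse models only the fields this sketch reads.
type targetsResponse struct {
	Data struct {
		ActiveTargets []struct {
			ScrapePool string `json:"scrapePool"`
			ScrapeURL  string `json:"scrapeUrl"`
			Health     string `json:"health"`
			LastError  string `json:"lastError"`
		} `json:"activeTargets"`
		DroppedTargetCounts map[string]int `json:"droppedTargetCounts"`
	} `json:"data"`
}

// summarize returns one line per active target and the total dropped count.
func summarize(body []byte) ([]string, int, error) {
	var r targetsResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, 0, err
	}
	var lines []string
	for _, t := range r.Data.ActiveTargets {
		lines = append(lines, fmt.Sprintf("%s %s health=%s err=%q",
			t.ScrapePool, t.ScrapeURL, t.Health, t.LastError))
	}
	dropped := 0
	for _, n := range r.Data.DroppedTargetCounts {
		dropped += n
	}
	return lines, dropped, nil
}

func main() {
	// Assumes `oc port-forward ... 9090:9090` is active in another terminal.
	resp, err := http.Get("http://localhost:9090/api/v1/targets")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	lines, dropped, err := summarize(body)
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
	fmt.Println("dropped targets:", dropped)
}
```

A target that is listed with a health other than `up` was matched but could not be scraped, and `lastError` usually says why.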
Querying Thanos can also help you identify what may be happening.
THANOS_QUERIER_HOST=$(oc -n openshift-monitoring get route thanos-querier -o jsonpath='{.spec.host}')
curl -k -H "Authorization: Bearer $(oc whoami -t)" "https://$THANOS_QUERIER_HOST/api/v1/query?query=my_app_up"
curl -k -H "Authorization: Bearer $(oc whoami -t)" "https://$THANOS_QUERIER_HOST/api/v1/query?query=my_app_requests_total"
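For scripted checks, the same query can be built and interpreted in Go. The sketch below is only an illustration: `buildQueryURL` and `seriesCount` are hypothetical helpers, the host is a placeholder, and the response shape assumed is the instant-query vector format from the Prometheus HTTP API, which Thanos Querier also serves. An empty result for a metric name generally means nothing is being scraped for it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// buildQueryURL builds an instant-query URL with the query properly escaped.
func buildQueryURL(host, query string) string {
	v := url.Values{}
	v.Set("query", query)
	return "https://" + host + "/api/v1/query?" + v.Encode()
}

// seriesCount returns how many series an instant vector query returned.
// It assumes data.result is a JSON array, as it is for vector results.
func seriesCount(body []byte) (int, error) {
	var r struct {
		Status string `json:"status"`
		Data   struct {
			Result []json.RawMessage `json:"result"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, err
	}
	if r.Status != "success" {
		return 0, fmt.Errorf("query status %q", r.Status)
	}
	return len(r.Data.Result), nil
}

func main() {
	// thanos.example.com stands in for $THANOS_QUERIER_HOST.
	fmt.Println(buildQueryURL("thanos.example.com", "my_app_up"))

	// An empty result set: the metric is not being scraped.
	empty := []byte(`{"status":"success","data":{"resultType":"vector","result":[]}}`)
	n, err := seriesCount(empty)
	fmt.Println(n, err)
}
```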
Checking the logs... your pod names will be different than mine.
oc logs -n openshift-user-workload-monitoring prometheus-operator-7c7b77cbff-7v7km -c prometheus-operator
A working setup should look like this:
1.) The deployed app should expose metrics at /metrics, ideally through a Prometheus client library.
2.) The ServiceMonitor uses label selectors that point to a Service.
3.) You should be able to query /metrics and get data back from the app.
4.) The logs of the user workload pods should show something like a discovery manager scrape entry referencing the ServiceMonitor you deployed.
5.) The OpenShift metrics targets page should show the target as up and running.
6.) If you port-forward the OpenShift Prometheus user workload pod to 9090:9090, you can query /api/v1/targets on it and should see your endpoint there. If it is counted under droppedTargetCounts, the ServiceMonitor was picked up but its target was discarded during discovery, which usually points to a label, namespace, or port name mismatch. An endpoint that is matched but cannot be scraped shows up as an active target with an error instead.
Common problems to check:
Missing service labels
Mismatched label selectors
Wrong namespace configuration
Port name mismatches
Metrics endpoint accessibility
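For the metrics endpoint check (step 3 in the list above), it can help to confirm that the response actually contains a sample for the metric you expect, not just HELP and TYPE comments. The following sketch assumes the Prometheus text exposition format. `hasMetric` is a hypothetical helper, and the sample body is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// hasMetric reports whether a text-format /metrics body contains at least one
// sample line for the named metric, ignoring "#" comment lines.
func hasMetric(body, name string) bool {
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// A sample line starts with the metric name, followed by either
		// a label set ("{") or a space before the value.
		if strings.HasPrefix(line, name+" ") || strings.HasPrefix(line, name+"{") {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical response body from the app's /metrics endpoint.
	body := "# TYPE my_app_up gauge\nmy_app_up 1\n"
	fmt.Println(hasMetric(body, "my_app_up"))             // true
	fmt.Println(hasMetric(body, "my_app_requests_total")) // false
}
```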