feat: add alertmanager sink #107

Merged
merged 16 commits into from
Nov 15, 2023
2 changes: 1 addition & 1 deletion .vscode/launch.json
@@ -31,6 +31,6 @@
// "KUBECONFIG": "",
"KUBEBUILDER_ASSETS": "/Users/tylergillson/spectrocloud/repos/oss/spectrocloud-labs/validation/validator/bin/k8s/1.27.1-darwin-arm64"
}
},
},
]
}
80 changes: 76 additions & 4 deletions README.md
@@ -23,22 +23,94 @@ Plugins:

## Installation
```bash
helm repo add validator https://spectrocloud-labs.github.io/validator/
helm repo add validator https://spectrocloud-labs.github.io/validator
helm repo update
helm install validator validator/validator -n validator --create-namespace
```

## Sinks
Validator can be configured to emit updates to various event sinks whenever a `ValidationResult` is created or updated. See configuration details below for each supported sink.

### Alertmanager
Integrate with the Alertmanager API to emit alerts to all [supported Alertmanager receivers](https://prometheus.io/docs/alerting/latest/configuration/#receiver-integration-settings), including generic webhooks. The only required configuration is an Alertmanager endpoint. HTTP basic authentication and TLS are also supported. See [values.yaml](https://github.com/spectrocloud-labs/validator/blob/main/chart/validator/values.yaml) for configuration details.

#### Sample Output
![Screen Shot 2023-11-15 at 10 42 20 AM](https://github.com/spectrocloud-labs/validator/assets/1795270/ce958b8e-96d7-4f5e-8efc-80e2fc2b0b4d)

#### Setup
1. Install Alertmanager in your cluster (if it isn't installed already)
2. Configure Alertmanager alert content. Alerts can be formatted/customized via the following labels and annotations:

Labels
- alertname
- plugin
- validation_result
- expected_results

Annotations
- state
- validation_rule
- validation_type
- message
- status
- detail
  - pipe-delimited array of detail messages; see the sample config for a parsing example
- failure (also pipe-delimited)
- last_validation_time

Example Alertmanager ConfigMap used to produce the sample output above:
```yaml
apiVersion: v1
data:
  alertmanager.yml: |
    global:
      slack_api_url: https://slack.com/api/chat.postMessage
    receivers:
    - name: default-receiver
      slack_configs:
      - channel: <channel-id>
        text: |-
          {{ range .Alerts.Firing -}}
          *Validation Result: {{ .Labels.validation_result }}/{{ .Labels.expected_results }}*

          {{ range $k, $v := .Annotations }}
          {{- if $v }}*{{ $k | title }}*:
          {{- if match "\\|" $v }}
          - {{ reReplaceAll "\\|" "\n- " $v -}}
          {{- else }}
          {{- printf " %s" $v -}}
          {{- end }}
          {{- end }}
          {{ end }}

          {{ end }}
        title: "{{ (index .Alerts 0).Labels.plugin }}: {{ (index .Alerts 0).Labels.alertname }}\n"
        http_config:
          authorization:
            credentials: xoxb--<bot>-<token>
        send_resolved: false
    route:
      group_interval: 10s
      group_wait: 10s
      receiver: default-receiver
      repeat_interval: 1h
    templates:
    - /etc/alertmanager/*.tmpl
kind: ConfigMap
metadata:
  name: alertmanager
  namespace: alertmanager
```

3. Install validator and/or upgrade your validator Helm release, configuring `values.sink` accordingly.
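
The `detail` and `failure` annotations listed above are plain strings joined with `|`. As a minimal sketch of how a custom receiver might unpack them, assuming individual messages never contain a literal pipe, the helpers `joinDetails` and `splitDetails` below are hypothetical names for illustration and are not part of validator:

```go
package main

import (
	"fmt"
	"strings"
)

// joinDetails flattens a list of messages into one pipe-delimited
// annotation value, the same shape the sink emits.
func joinDetails(details []string) string {
	return strings.Join(details, "|")
}

// splitDetails is a hypothetical receiver-side helper. It returns nil
// for an empty annotation instead of a slice holding one empty string.
func splitDetails(annotation string) []string {
	if annotation == "" {
		return nil
	}
	return strings.Split(annotation, "|")
}

func main() {
	annotation := joinDetails([]string{"rule A passed", "rule B failed"})
	for _, d := range splitDetails(annotation) {
		fmt.Println("-", d)
	}
}
```

The empty-string guard matters because `strings.Split("", "|")` returns a one-element slice, which would render as a single blank bullet.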

### Slack

#### Sample Output
<img width="704" alt="Screen Shot 2023-11-10 at 4 30 12 PM" src="https://github.com/spectrocloud-labs/validator/assets/1795270/c011143a-4d4b-4299-b88b-699188f4bda2">
<img width="700" alt="Screen Shot 2023-11-10 at 4 18 22 PM" src="https://github.com/spectrocloud-labs/validator/assets/1795270/9f2c4ab7-34d6-496a-9f60-68655a7ee3d6">

#### Setup

1. Go to https://api.slack.com/apps and click **Create New App**, then select **From scratch**. Pick an App Name and Slack Workspace, then click **Create App**.

<img src="https://github.com/spectrocloud-labs/validator/assets/1795270/58cbb5a0-12a4-4a83-a0dd-20ae87a8105d" width="500">
@@ -53,8 +125,8 @@ Validator can be configured to emit updates to various event sinks whenever a `V

4. Install validator and/or upgrade your validator Helm release, configuring `values.sink` accordingly.

## Getting Started
You’ll need a Kubernetes cluster to run against. You can use [KIND](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster.
## Development
You’ll need a Kubernetes cluster to run against. You can use [kind](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster.
**Note:** Your controller will automatically use the current context in your kubeconfig file (i.e. whatever cluster `kubectl cluster-info` shows).

### Running on the cluster
2 changes: 1 addition & 1 deletion api/v1alpha1/validatorconfig_types.go
@@ -28,7 +28,7 @@ type ValidatorConfigSpec struct {
}

type Sink struct {
// +kubebuilder:validation:Enum=slack
// +kubebuilder:validation:Enum=alertmanager;slack
Type string `json:"type"`
// Name of a K8s secret containing configuration details for the sink
SecretName string `json:"secretName"`
@@ -65,6 +65,7 @@ spec:
type: string
type:
enum:
- alertmanager
- slack
type: string
required:
7 changes: 6 additions & 1 deletion chart/validator/templates/sink-secret.yaml
@@ -4,7 +4,12 @@ kind: Secret
metadata:
name: {{ required ".Values.sink.secretName is required!" .Values.sink.secretName }}
stringData:
{{- if eq .Values.sink.type "slack" }}
{{- if eq .Values.sink.type "alertmanager" }}
endpoint: {{ required ".Values.sink.endpoint is required!" .Values.sink.endpoint }}
caCert: {{ .Values.sink.caCert }}
username: {{ .Values.sink.username }}
password: {{ .Values.sink.password }}
{{- else if eq .Values.sink.type "slack" }}
apiToken: {{ required ".Values.sink.apiToken is required!" .Values.sink.apiToken }}
channelId: {{ required ".Values.sink.channelId is required!" .Values.sink.channelId }}
{{- end }}
11 changes: 10 additions & 1 deletion chart/validator/values.yaml
@@ -54,10 +54,19 @@ metricsService:

# Optional sink configuration
sink: {}
# type: alertmanager
# secretName: alertmanager-sink-secret
# endpoint: "http://alertmanager.alertmanager.svc.cluster.local:9093"
# caCert: "" # (TLS CA certificate, optional)
# username: "" # (HTTP basic auth, optional)
# password: "" # (HTTP basic auth, optional)

# OR
# type: slack
# secretName: "slack-secret"
# secretName: slack-sink-secret
# apiToken: ""
# channelId: ""

# By default, a secret will be created. Leave the above fields blank and specify 'createSecret: false' to use an existing secret.
# WARNING: the existing secret must match the format used in sink-secret.yaml
# createSecret: true
@@ -65,6 +65,7 @@ spec:
type: string
type:
enum:
- alertmanager
- slack
type: string
required:
4 changes: 2 additions & 2 deletions internal/controller/validationresult_controller.go
@@ -132,7 +132,7 @@
sinkConfig = sinkSecret.Data
}

if err := sink.Configure(*r.SinkClient, *vc, sinkConfig); err != nil {
if err := sink.Configure(*r.SinkClient, sinkConfig); err != nil {

r.Log.Error(err, "failed to configure sink")
return ctrl.Result{}, err
}
@@ -177,7 +177,7 @@
vr.Status.SinkState = sinkState

if err := r.Status().Update(context.Background(), vr); err != nil {
r.Log.V(0).Error(err, "failed to update ValidationResult status")
r.Log.V(1).Info("warning: failed to update ValidationResult status", "error", err.Error())

return err
}

160 changes: 160 additions & 0 deletions internal/sinks/alertmanager.go
@@ -0,0 +1,160 @@
package sinks

import (
"bytes"
"crypto/tls"
"crypto/x509"
"encoding/base64"
"encoding/json"
"fmt"
"net/http"
"net/url"
"strconv"
"strings"

"github.com/go-logr/logr"
"github.com/pkg/errors"

"github.com/spectrocloud-labs/validator/api/v1alpha1"
)

type AlertmanagerSink struct {
client Client
log logr.Logger

endpoint string
username string
password string
}

type Alert struct {
Annotations map[string]string `json:"annotations"`
Labels map[string]string `json:"labels"`
}

var (
InvalidEndpoint = errors.New("invalid Alertmanager config: endpoint scheme and host are required")
EndpointRequired = errors.New("invalid Alertmanager config: endpoint required")
)

func (s *AlertmanagerSink) Configure(c Client, config map[string][]byte) error {
// endpoint
endpoint, ok := config["endpoint"]
if !ok {
return EndpointRequired
}
u, err := url.Parse(string(endpoint))
if err != nil {
return errors.Wrap(err, "invalid Alertmanager config: failed to parse endpoint")
}

if u.Scheme == "" || u.Host == "" {
return InvalidEndpoint
}
if u.Path != "" {
u.Path = ""
}
s.endpoint = fmt.Sprintf("%s/api/v2/alerts", u.String())

// basic auth
s.username = string(config["username"])
s.password = string(config["password"])

// tls
var caCertPool *x509.CertPool
var insecureSkipVerify bool

insecure, ok := config["insecureSkipVerify"]
if ok {
insecureSkipVerify, err = strconv.ParseBool(string(insecure))
if err != nil {
return errors.Wrap(err, "invalid Alertmanager config: failed to parse insecureSkipVerify")
}
}
caCert, ok := config["caCert"]
if ok {
caCertPool, err = x509.SystemCertPool()
if err != nil {
return errors.Wrap(err, "invalid Alertmanager config: failed to get system cert pool")
}

caCertPool.AppendCertsFromPEM(caCert)
}

c.hclient.Transport = &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: insecureSkipVerify,
MinVersion: tls.VersionTLS12,
RootCAs: caCertPool,
},
}
s.client = c

return nil
}
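
The endpoint handling in `Configure` can be exercised in isolation. The sketch below is a stricter variation, not the code in this PR: the hypothetical helper `alertsURL` rebuilds the URL from scheme and host only, so it also drops any query string or fragment, whereas the diff above clears only the path. The host name in `main` is made up.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// alertsURL is a hypothetical helper that normalizes a user-supplied
// endpoint to the Alertmanager v2 alerts URL.
func alertsURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("failed to parse endpoint: %w", err)
	}
	// "alertmanager:9093" parses without error but has no host,
	// so both fields must be checked explicitly.
	if u.Scheme == "" || u.Host == "" {
		return "", errors.New("endpoint scheme and host are required")
	}
	// Keep only scheme and host; path, query and fragment are discarded.
	base := url.URL{Scheme: u.Scheme, Host: u.Host, Path: "/api/v2/alerts"}
	return base.String(), nil
}

func main() {
	for _, raw := range []string{
		"http://alertmanager.example:9093/ignored/path?x=1",
		"alertmanager:9093",
	} {
		out, err := alertsURL(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, out, err)
	}
}
```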

func (s *AlertmanagerSink) Emit(r v1alpha1.ValidationResult) error {
alerts := make([]Alert, 0, len(r.Status.Conditions))

for i, c := range r.Status.Conditions {
alerts = append(alerts, Alert{
Labels: map[string]string{
"alertname": r.Name,
"plugin": r.Spec.Plugin,
"validation_result": strconv.Itoa(i + 1),
"expected_results": strconv.Itoa(r.Spec.ExpectedResults),
},
Annotations: map[string]string{
"state": string(r.Status.State),
"validation_rule": c.ValidationRule,
"validation_type": c.ValidationType,
"message": c.Message,
"status": string(c.Status),
"detail": strings.Join(c.Details, "|"),
"failure": strings.Join(c.Failures, "|"),
"last_validation_time": c.LastValidationTime.String(),
},
})
}

body, err := json.Marshal(alerts)
if err != nil {
s.log.Error(err, "failed to marshal alerts", "alerts", alerts)
return err
}

s.log.V(1).Info("Alertmanager message", "payload", body)

req, err := http.NewRequest(http.MethodPost, s.endpoint, bytes.NewReader(body))
if err != nil {
s.log.Error(err, "failed to create HTTP POST request", "endpoint", s.endpoint)
return err
}

req.Header.Add("Content-Type", "application/json")

if s.username != "" && s.password != "" {
req.Header.Add(basicAuthHeader(s.username, s.password))
}


resp, err := s.client.hclient.Do(req)
defer func() {
if resp != nil {
_ = resp.Body.Close()
}
}()
if err != nil {
s.log.Error(err, "failed to post alert", "endpoint", s.endpoint)
return err
}

if resp.StatusCode != 200 {
s.log.V(0).Info("failed to post alert", "endpoint", s.endpoint, "status", resp.Status, "code", resp.StatusCode)
return SinkEmissionFailed
}

s.log.V(0).Info("Successfully posted alert to Alertmanager", "endpoint", s.endpoint, "status", resp.Status, "code", resp.StatusCode)
return nil
}

func basicAuthHeader(username, password string) (string, string) {
auth := base64.StdEncoding.EncodeToString(
bytes.Join([][]byte{[]byte(username), []byte(password)}, []byte(":")),
)
return "Authorization", fmt.Sprintf("Basic %s", auth)
}
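
For comparison, the standard library's `(*http.Request).SetBasicAuth` produces the same `Authorization` header as a hand-built one. The sketch below shows the equivalence under illustrative assumptions: `manualBasicAuth` and `stdlibBasicAuth` are hypothetical names, and the URL is a placeholder that is never contacted.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
)

// manualBasicAuth builds the header value by hand: "Basic " + base64("user:pass").
func manualBasicAuth(username, password string) string {
	token := base64.StdEncoding.EncodeToString([]byte(username + ":" + password))
	return "Basic " + token
}

// stdlibBasicAuth lets net/http set the header and reads it back.
func stdlibBasicAuth(username, password string) (string, error) {
	// The request is only constructed, never sent.
	req, err := http.NewRequest(http.MethodPost, "http://alertmanager.example/api/v2/alerts", nil)
	if err != nil {
		return "", err
	}
	req.SetBasicAuth(username, password)
	return req.Header.Get("Authorization"), nil
}

func main() {
	manual := manualBasicAuth("user", "pass")
	std, err := stdlibBasicAuth("user", "pass")
	fmt.Println("headers match:", manual == std, "err:", err)
}
```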