digitalis-io/terraform-module-monitoring

Terraform Kubernetes Monitoring Module

A comprehensive Terraform module for deploying a production-ready monitoring and observability stack on Kubernetes clusters. The module provides a complete solution for collecting metrics, logs, and traces, with long-term storage.

Features

  • Complete Observability Stack: Deploy Prometheus, Grafana, Loki, Alloy, and OpenTelemetry in a single module
  • Long-term Metrics Storage: Grafana Mimir for scalable, multi-tenant Prometheus backend
  • Log Aggregation: Distributed Loki deployment for centralized log management
  • Log Collection: Grafana Alloy for efficient log collection and forwarding to Loki
  • Trace Collection: OpenTelemetry Operator for distributed tracing
  • Certificate Management: Automated TLS certificate handling with cert-manager
  • DNS Management: External-DNS for automatic DNS record creation
  • Modular Design: Enable/disable individual components based on your needs
  • Production Ready: Persistent storage, high availability configurations, and proper resource limits
  • Pre-configured Integration: Components are automatically integrated with proper data sources

Usage Example

Basic Usage

module "monitoring" {
  source = "path/to/terraform-module-monitoring"

  # Deploy the full monitoring stack
  prometheus = {
    enabled = true
  }
  
  grafana = {
    enabled = true
  }
  
  loki = {
    enabled = true
  }
  
  alloy = {
    enabled = true
  }
  
  namespace = "monitoring"
}

Advanced Configuration

module "monitoring" {
  source = "path/to/terraform-module-monitoring"
  
  # Kubernetes configuration
  kube_context = "my-cluster"
  kubeconfig   = "~/.kube/config"
  namespace    = "observability"
  
  # Enable all components with custom configuration
  external_dns = {
    enabled   = true
    version   = "1.17.0"
    namespace = "external-dns"
  }
  
  cert_manager = {
    enabled   = true
    version   = "v1.18.2"
    namespace = "cert-manager"
  }
  
  prometheus = {
    enabled = true
    version = "75.9.0"
    name    = "prometheus"
    chart   = "kube-prometheus-stack"
  }
  
  grafana_mimir = {
    enabled   = true
    version   = "5.7.0"
    namespace = "monitoring"
  }
  
  loki = {
    enabled   = true
    version   = "0.79.3"
    namespace = "monitoring"
  }
  
  alloy = {
    enabled   = true
    version   = "0.10.0"
    namespace = "monitoring"
  }
  
  opentelemetry = {
    enabled = true
    version = "0.90.4"
  }
  
  grafana = {
    enabled = true
    version = "12.0.2"
  }
}

Local Development with K3s

The module includes a K3s-based local development environment:

# Start the local K3s cluster
cd k3s
./start.sh

# Apply the monitoring stack
terraform init
terraform apply -var-file="k3s/terraform.tfvars"

# Access services (using nip.io for local DNS)
# Prometheus: http://prometheus.127.0.0.1.nip.io
# Grafana: http://grafana.127.0.0.1.nip.io

# Cleanup
./destroy.sh

Requirements

Name Version
terraform >= 1.0.0
helm >= 2.0.0
kubernetes >= 2.0.0

Providers

Name Version
helm >= 2.0.0
kubernetes >= 2.0.0

Inputs

Name Type Default Description
kube_context string null The Kubernetes context to use
kubeconfig string "~/.kube/config" Path to the Kubernetes config file
namespace string "monitoring" The default Kubernetes namespace where resources will be installed
external_dns any See below External-DNS Helm chart configuration
cert_manager any See below Cert-Manager Helm chart configuration with self-signed cluster issuer
prometheus any See below Prometheus Helm chart configuration with ingress enabled
grafana_mimir any See below Grafana Mimir (distributed Prometheus backend) Helm chart configuration
loki any See below Loki distributed Helm chart configuration for log aggregation
alloy any See below Grafana Alloy Helm chart configuration for log collection and forwarding
opentelemetry any See below OpenTelemetry Helm chart configuration
grafana any See below Grafana Helm chart configuration with pre-configured data sources

Component Configuration Defaults

Each component can be configured with the following structure:

{
  enabled    = bool   # Whether to install this component
  version    = string # Helm chart version
  name       = string # Helm release name
  chart      = string # Helm chart name
  namespace  = string # Kubernetes namespace (optional, uses module namespace if not specified)
  repository = string # Helm chart repository URL
}

Default configurations:

  • external_dns: enabled = true, version = "1.17.0"
  • cert_manager: enabled = true, version = "v1.18.2"
  • prometheus: enabled = true, version = "75.9.0", using kube-prometheus-stack
  • grafana_mimir: enabled = true, version = "5.7.0", using mimir-distributed
  • loki: enabled = true, version = "0.79.3", using loki-distributed
  • alloy: enabled = true, version = "0.10.0"
  • opentelemetry: enabled = true, version = "0.90.4"
  • grafana: enabled = false, version = "12.0.2"
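
For example, a minimal override using the fields above might look like the following. Note that standalone Grafana is disabled by default; the repository URL shown is the standard Grafana Helm chart repository, and the loki values simply restate the module defaults:

```hcl
module "monitoring" {
  source = "path/to/terraform-module-monitoring"

  # Grafana is disabled by default; enable it explicitly.
  grafana = {
    enabled = true
    version = "12.0.2"
  }

  # Any field from the structure above can be overridden per component,
  # e.g. pinning Loki to an explicit chart and repository.
  loki = {
    enabled    = true
    version    = "0.79.3"
    chart      = "loki-distributed"
    repository = "https://grafana.github.io/helm-charts"
  }
}
```

Fields you omit fall back to the defaults listed above.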

Outputs

Currently, this module does not expose any outputs. Future versions may include:

  • Service URLs for Prometheus, Grafana, and other components
  • Installation status for each component
  • Generated passwords and credentials

Architecture

The module deploys the following architecture:

┌───────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                     │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │  Prometheus   │  │    Grafana    │  │ External-DNS  │  │
│  │     Stack     │  │               │  │               │  │
│  └───────┬───────┘  └───────┬───────┘  └───────────────┘  │
│          │                  │                             │
│          ▼                  ▼                             │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │    Grafana    │  │     Loki      │  │ Cert-Manager  │  │
│  │     Mimir     │  │ (Distributed) │  │               │  │
│  └───────────────┘  └───────┬───────┘  └───────────────┘  │
│                             │                             │
│                     ┌───────▼───────┐                     │
│                     │     Alloy     │                     │
│                     │  (DaemonSet)  │                     │
│                     └───────────────┘                     │
│                                                           │
│  ┌────────────────────────┐   ┌────────────────────────┐  │
│  │     OpenTelemetry      │   │    Jaeger Operator     │  │
│  │        Operator        │   │                        │  │
│  └────────────────────────┘   └────────────────────────┘  │
│                                                           │
└───────────────────────────────────────────────────────────┘

Data Flow

  1. Metrics: Prometheus scrapes metrics → Remote writes to Grafana Mimir for long-term storage
  2. Logs: Applications → Alloy (DaemonSet) → Loki → Grafana for visualization
  3. Traces: Applications → OpenTelemetry Collector → Storage backend
  4. Visualization: Grafana provides unified dashboards for all telemetry data

Component Details

Prometheus Stack (kube-prometheus-stack)

  • Full Prometheus Operator deployment
  • Pre-configured ServiceMonitors for Kubernetes components
  • AlertManager for alert routing
  • Prometheus server with remote write to Mimir

Grafana Mimir

  • Horizontally scalable, multi-tenant Prometheus backend
  • Long-term metrics storage
  • Compatible with Prometheus remote write API
  • Includes MinIO for object storage (optional)
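
The module wires Prometheus's remote write to Mimir for you; for reference, the equivalent kube-prometheus-stack values look roughly like this. The gateway service hostname is an assumption — check the Service actually created by your mimir-distributed release:

```yaml
# Sketch of kube-prometheus-stack values for remote write to Mimir.
# The service hostname below is an assumption; verify it with
# `kubectl get svc -n monitoring`.
prometheus:
  prometheusSpec:
    remoteWrite:
      - url: http://mimir-nginx.monitoring.svc/api/v1/push
```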

Loki

  • Distributed deployment for high availability
  • Efficient log aggregation and storage
  • Integrated with Grafana for log exploration
  • Configured with appropriate storage classes

OpenTelemetry

  • Operator pattern for managing OpenTelemetry collectors
  • Support for traces, metrics, and logs
  • Auto-instrumentation capabilities
  • Kubernetes-native resource management

Grafana

  • Pre-configured data sources for Prometheus, Mimir, and Loki
  • Dashboard provisioning for common use cases
  • RBAC and authentication support
  • Ingress configuration for web access

Alloy

  • DaemonSet deployment for log collection from all nodes
  • Kubernetes service discovery for automatic pod log collection
  • Pre-configured pipeline to forward logs to Loki
  • Efficient log processing with static labels and filtering
  • Container runtime agnostic (Docker, containerd, CRI-O)
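
As an illustration of the kind of pipeline the module configures, a minimal Alloy configuration along these lines discovers pods, tails their logs, and forwards them to Loki. The component names are standard Alloy building blocks, but the gateway URL is an assumption tied to the loki-distributed release name:

```
// Discover pods running in the cluster via the Kubernetes API.
discovery.kubernetes "pods" {
  role = "pod"
}

// Tail logs from the discovered pods.
loki.source.kubernetes "pods" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [loki.write.default.receiver]
}

// Push collected logs to the Loki gateway (URL is deployment-specific).
loki.write "default" {
  endpoint {
    url = "http://loki-loki-distributed-gateway.monitoring.svc/loki/api/v1/push"
  }
}
```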

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

About Us

At Digitalis, our mission is to make the adoption of cloud-native and distributed data technologies as easy and seamless as possible for enterprises—on any Kubernetes, any cloud, and any data center. We focus on the technology stack that powers modern businesses, knowing this area can create a significant impact for our customers. If your organization is considering these technologies to drive transformation, we're here to guide you every step of the way.

Contact our team for a free consultation to discuss how we can tailor our approach to your specific needs and challenges.

License

This module is licensed under the MIT License - see the LICENSE file for details.

About

Terraform module that installs a full monitoring suite into a Kubernetes cluster.
