
Monitoring Overview

Stack

The monitoring stack runs as Docker containers attached to the monitoring network on VPS #1.

Component          Version  Role                                              Port
Prometheus         v3.10    Metrics aggregation, alerting rules               9090
Grafana            v12.4    Dashboards, unified alerting                      3000 (proxied via nginx)
Loki               v3.6     Log aggregation                                   3100
Alloy              v1.14    Log collector (Docker auto-discovery)
node-exporter      v1.10    Host system metrics (CPU, memory, disk, network)  9100
cAdvisor           v0.55    Container metrics (CPU, memory, I/O)              8080
postgres-exporter  v0.19    PostgreSQL metrics                                9187
redis-exporter     v1.82    Redis metrics                                     9121
nginx-exporter     v1.4     Nginx reverse proxy metrics                       9113
celery-exporter    v0.10    Celery task queue metrics                         9808
blackbox-exporter  v0.26    External probes (SSL, HTTP health)                9115
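
As a rough illustration of how the ports listed above might translate into scrape targets, the fragment below sketches part of a prometheus/prometheus.yml. It is not the repo's actual file: the job names, the 30s scrape interval, the probed URL, the http_2xx module name, and the idea that each exporter is reachable under a Docker service name matching its component name are all assumptions made for the example.

```yaml
# Illustrative sketch only. Hostnames assume Docker service names that match
# the component names in the Stack table.
global:
  scrape_interval: 30s            # assumed value, not taken from the repo

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]

  - job_name: postgres
    static_configs:
      - targets: ["postgres-exporter:9187"]

  - job_name: redis
    static_configs:
      - targets: ["redis-exporter:9121"]

  - job_name: nginx
    static_configs:
      - targets: ["nginx-exporter:9113"]

  - job_name: celery
    static_configs:
      - targets: ["celery-exporter:9808"]

  # Blackbox probes: Prometheus scrapes the exporter's /probe endpoint and
  # passes the URL to check as the "target" query parameter.
  - job_name: blackbox-http
    metrics_path: /probe
    params:
      module: [http_2xx]          # assumed module; must exist in blackbox/blackbox.yml
    static_configs:
      - targets: ["https://example.com/health"]   # placeholder URL
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115
```

The relabeling in the last job is the usual blackbox-exporter pattern: the listed URL becomes the probe target and the instance label, while the scrape itself goes to the exporter on port 9115.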

Architecture

                              ┌──────────────┐
                              │   Grafana    │ ← dashboards + alerts
                              │  :3000       │
                              └──┬───────┬───┘
                                 │       │
                    ┌────────────┘       └────────────┐
                    ▼                                  ▼
             ┌──────────────┐                  ┌──────────────┐
             │  Prometheus  │                  │     Loki     │
             │  :9090       │                  │  :3100       │
             └──────┬───────┘                  └──────┬───────┘
                    │                                  │
        ┌───────────┼───────────┐                      │
        ▼           ▼           ▼                      ▼
   exporters   blackbox    cAdvisor              ┌──────────────┐
   (node,      (SSL +      (container            │    Alloy     │
    pg, redis,  HTTP        metrics)              │  (log shim)  │
    nginx,      probes)                           └──────────────┘
    celery)                                             │
                                                   Docker socket
                                                   (auto-discover)
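
The two arrows leaving Grafana in the diagram correspond to two data sources. A provisioning file under grafana/provisioning/datasources/ could declare them roughly as follows. This is a minimal sketch: the data source names and the prometheus and loki hostnames are assumptions, not values taken from the repo.

```yaml
# Hypothetical Grafana data source provisioning file (sketch).
apiVersion: 1

datasources:
  - name: Prometheus              # assumed display name
    type: prometheus
    access: proxy                 # Grafana's backend performs the queries
    url: http://prometheus:9090   # assumed service name, port from the Stack table
    isDefault: true

  - name: Loki                    # assumed display name
    type: loki
    access: proxy
    url: http://loki:3100         # assumed service name, port from the Stack table
```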

Access

Configuration

All configs live in the Aether repo at monitoring/ and deploy to /opt/docker/monitoring/:

monitoring/
├── docker-compose.yml
├── prometheus/
│   ├── prometheus.yml          # scrape targets
│   └── alerts/                 # 6 rule files
├── loki/loki-config.yml
├── alloy/config.alloy
├── blackbox/blackbox.yml
└── grafana/provisioning/
    ├── datasources/
    ├── dashboards/json/        # 5 pre-built dashboards
    └── alerting/               # contact points, policies, rules
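
For orientation, a rule file under prometheus/alerts/ would follow the standard Prometheus rule-group format. The example below is hypothetical: the group name, alert name, 90% threshold, and 10-minute duration are invented for illustration and are not taken from the six rule files in the repo. It relies only on two metrics that node-exporter exposes on Linux hosts.

```yaml
# Hypothetical alert rule file (sketch), e.g. prometheus/alerts/host.yml
groups:
  - name: host                    # assumed group name
    rules:
      - alert: HostHighMemoryUsage
        # Fraction of memory in use, derived from node-exporter metrics.
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.9
        for: 10m                  # condition must hold this long before firing
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 90% on {{ $labels.instance }}"
```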

Resource Limits

Container          Memory Limit
Prometheus         384 MB
Grafana            384 MB
Loki               384 MB
Alloy              384 MB
node-exporter      64 MB
cAdvisor           128 MB
postgres-exporter  32 MB
redis-exporter     32 MB
nginx-exporter     32 MB
celery-exporter    64 MB
blackbox-exporter  32 MB
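
Taken together, these limits add up to 1,920 MB for the whole stack. One way such limits could be expressed in docker-compose.yml is sketched below. The service names are assumed to match the container names listed above, and images, volumes, and networks are left out.

```yaml
# docker-compose.yml fragment (sketch); service names are assumptions.
services:
  prometheus:
    mem_limit: 384m
  grafana:
    mem_limit: 384m
  loki:
    mem_limit: 384m
  alloy:
    mem_limit: 384m
  node-exporter:
    mem_limit: 64m
  cadvisor:
    mem_limit: 128m
  postgres-exporter:
    mem_limit: 32m
  redis-exporter:
    mem_limit: 32m
  nginx-exporter:
    mem_limit: 32m
  celery-exporter:
    mem_limit: 64m
  blackbox-exporter:
    mem_limit: 32m
```

Compose also accepts the limit under deploy.resources.limits.memory. Which form the repo uses is not stated here.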

Retention

  • Prometheus metrics: 15 days
  • Loki logs: 7 days
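
These retention periods are typically set in two places: a command-line flag for Prometheus and the limits and compactor sections for Loki. The fragments below are a sketch of how that might look. The container config path and the filesystem delete-request store are assumptions, and the actual files in the repo may differ.

```yaml
# --- docker-compose.yml fragment (sketch) ---
services:
  prometheus:
    command:
      - --config.file=/etc/prometheus/prometheus.yml   # assumed container path
      - --storage.tsdb.retention.time=15d              # 15 days of metrics
---
# --- loki/loki-config.yml fragment (sketch) ---
limits_config:
  retention_period: 168h             # 7 days of logs
compactor:
  retention_enabled: true            # the compactor is what applies retention
  delete_request_store: filesystem   # assumed store; needed when retention is enabled
```

Overriding command in Compose replaces the image's default arguments, which is why the config file flag is repeated alongside the retention flag.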