Monitoring Overview
Stack
The monitoring stack runs as a set of Docker containers attached to the monitoring network on VPS #1.
| Component | Version | Role | Port |
|---|---|---|---|
| Prometheus | v3.10 | Metrics aggregation, alerting rules | 9090 |
| Grafana | v12.4 | Dashboards, unified alerting | 3000 (proxied via nginx) |
| Loki | v3.6 | Log aggregation | 3100 |
| Alloy | v1.14 | Log collector (Docker auto-discovery) | — |
| node-exporter | v1.10 | Host system metrics (CPU, memory, disk, network) | 9100 |
| cAdvisor | v0.55 | Container metrics (CPU, memory, I/O) | 8080 |
| postgres-exporter | v0.19 | PostgreSQL metrics | 9187 |
| redis-exporter | v1.82 | Redis metrics | 9121 |
| nginx-exporter | v1.4 | Nginx reverse proxy metrics | 9113 |
| celery-exporter | v0.10 | Celery task queue metrics | 9808 |
| blackbox-exporter | v0.26 | External probes (SSL, HTTP health) | 9115 |
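The ports in the table map directly onto Prometheus scrape targets. A minimal sketch of what prometheus/prometheus.yml might contain follows. It assumes each container is reachable on the monitoring network under a hostname equal to its component name. The hostnames, job names, scrape interval, blackbox module name, and the choice of probe target are illustrative assumptions, not taken from the actual file.

```yaml
# Hypothetical prometheus.yml fragment; hostnames and job names are assumed.
global:
  scrape_interval: 15s   # assumed value

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["node-exporter:9100"]
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]
  - job_name: postgres
    static_configs:
      - targets: ["postgres-exporter:9187"]
  - job_name: redis
    static_configs:
      - targets: ["redis-exporter:9121"]
  - job_name: nginx
    static_configs:
      - targets: ["nginx-exporter:9113"]
  - job_name: celery
    static_configs:
      - targets: ["celery-exporter:9808"]

  # Blackbox probes go through the exporter's /probe endpoint. The relabeling
  # moves the probed URL into the "target" query parameter and points the
  # actual scrape at the exporter itself.
  - job_name: blackbox_http
    metrics_path: /probe
    params:
      module: [http_2xx]   # assumed to be defined in blackbox/blackbox.yml
    static_configs:
      - targets: ["https://monitoring.groupe-suffren.com"]   # illustrative probe target
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115
```

Alloy has no entry here because it pushes logs to Loki rather than being scraped for the metrics listed above.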
Architecture
                            ┌──────────────┐
                            │   Grafana    │  ← dashboards + alerts
                            │    :3000     │
                            └──┬───────┬───┘
                               │       │
                  ┌────────────┘       └────────────┐
                  ▼                                 ▼
           ┌──────────────┐                  ┌──────────────┐
           │  Prometheus  │                  │     Loki     │
           │    :9090     │                  │    :3100     │
           └──────┬───────┘                  └──────┬───────┘
                  │                                 │
      ┌───────────┼───────────┐                     │
      ▼           ▼           ▼                     ▼
  exporters    blackbox   cAdvisor           ┌──────────────┐
  (node,       (SSL +     (container         │    Alloy     │
  pg, redis,   HTTP       metrics)           │  (log shim)  │
  nginx,       probes)                       └──────────────┘
  celery)                                           │
                                              Docker socket
                                             (auto-discover)
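The two arrows leaving Grafana correspond to two data sources. A sketch of a provisioning file that could sit under grafana/provisioning/datasources/ is shown here. It assumes the containers resolve as prometheus and loki on the monitoring network; the file name and hostnames are assumptions.

```yaml
# Hypothetical grafana/provisioning/datasources/datasources.yml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy              # requests are made by the Grafana backend
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```

With access set to proxy, browsers only ever talk to Grafana, which fits Prometheus and Loki staying internal.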
Access
- Grafana UI: https://monitoring.groupe-suffren.com
- Prometheus UI: Internal only (port 9090, not exposed via nginx)
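One way to get this access pattern in docker-compose.yml is to publish a host port only for Grafana and leave Prometheus reachable solely over the Docker network. The fragment below is a sketch under assumptions: the service names, the untagged images, and the loopback binding for a host-level nginx are illustrative. The real file may instead run nginx as a container on the same network.

```yaml
# Hypothetical docker-compose.yml fragment
services:
  grafana:
    image: grafana/grafana        # tag omitted; pin to the deployed version
    ports:
      - "127.0.0.1:3000:3000"     # assumed: loopback only, nginx on the host proxies to it
    networks: [monitoring]

  prometheus:
    image: prom/prometheus        # tag omitted; pin to the deployed version
    expose:
      - "9090"                    # visible to other containers, not published on the host
    networks: [monitoring]

networks:
  monitoring: {}
```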
Configuration
All configuration lives in the Aether repo under monitoring/ and is deployed to /opt/docker/monitoring/:
monitoring/
├── docker-compose.yml
├── prometheus/
│   ├── prometheus.yml        # scrape targets
│   └── alerts/               # 6 rule files
├── loki/loki-config.yml
├── alloy/config.alloy
├── blackbox/blackbox.yml
└── grafana/provisioning/
    ├── datasources/
    ├── dashboards/json/      # 5 pre-built dashboards
    └── alerting/             # contact points, policies, rules
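Files under prometheus/alerts/ would follow the standard Prometheus rule-file layout. As an illustration only, a rule that fires when any scrape target stops responding could look like the sketch below. The file name, group name, alert name, duration, and labels are hypothetical and not copied from the six existing rule files.

```yaml
# Hypothetical prometheus/alerts/availability.yml
groups:
  - name: availability
    rules:
      - alert: TargetDown
        expr: up == 0            # "up" is recorded by Prometheus for every scrape target
        for: 5m                  # condition must hold for 5 minutes before the alert fires
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.job }} target {{ $labels.instance }} is down"
```

Prometheus only loads rule files listed under rule_files in prometheus.yml, so a glob such as alerts/*.yml (relative to wherever the directory is mounted in the container) would be needed for a file like this to take effect.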
Resource Limits
| Container | Memory Limit |
|---|---|
| Prometheus | 384 MB |
| Grafana | 384 MB |
| Loki | 384 MB |
| Alloy | 384 MB |
| node-exporter | 64 MB |
| cAdvisor | 128 MB |
| postgres-exporter | 32 MB |
| redis-exporter | 32 MB |
| nginx-exporter | 32 MB |
| celery-exporter | 64 MB |
| blackbox-exporter | 32 MB |
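Limits like these are typically set per service in docker-compose.yml. The sketch below shows a few rows of the table expressed with the Compose mem_limit attribute. The service names are assumed to match the component names, and the actual file may use deploy.resources.limits.memory instead.

```yaml
# Hypothetical docker-compose.yml fragment; service names are assumed.
services:
  prometheus:
    mem_limit: 384m
  grafana:
    mem_limit: 384m
  loki:
    mem_limit: 384m
  alloy:
    mem_limit: 384m
  cadvisor:
    mem_limit: 128m
  node-exporter:
    mem_limit: 64m
  postgres-exporter:
    mem_limit: 32m
```

A container that exceeds its limit is killed by the kernel OOM killer, so a restart policy on each service is worth pairing with these values.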
Retention
- Prometheus metrics: 15 days
- Loki logs: 7 days
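The two retention periods are configured in different places: Prometheus takes a command-line flag, while Loki enforces retention through its compactor. The fragments below sketch how 15 days and 7 days could be expressed. The service name, container paths, and compactor working directory are assumptions, and the real files may differ.

```yaml
# docker-compose.yml (fragment): Prometheus retention is a command-line flag.
services:
  prometheus:
    command:
      # Overriding "command" replaces the image defaults, so they are restated here.
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.retention.time=15d
---
# loki/loki-config.yml (fragment): retention is applied by the compactor.
compactor:
  working_directory: /loki/compactor   # assumed path
  retention_enabled: true
  delete_request_store: filesystem     # Loki 3.x requires this when retention is enabled
limits_config:
  retention_period: 168h               # 7 days
```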