Infrastructure Contract: Aletheia + Helios + Aether
Date: 2026-03-27
Status: Active
Shared across: Aletheia, Helios, Aether repos
1. Context & Repos
Three repositories share a single infrastructure stack:
| Repo | Purpose | Language | GitHub | Local path |
|---|---|---|---|---|
| Aletheia | Practice management backend + website CMS | Python 3.13, Django 5.2 | baudry-suffren/aletheia_v2 | ~/coding/aletheia/aletheia_v2/ |
| Helios | Multi-tenant dental practice websites | TypeScript, Next.js 16 | TBD (create during B1) | ~/coding/helios/helios_test2/ |
| Aether | Shared infrastructure (nginx, DB, Redis, monitoring, security, backups) | Docker Compose, shell | TBD (create after first Helios deploy) | ~/coding/aether/ (future) |
Current state: Aether doesn't exist yet. All infrastructure lives in aletheia_v2/infra/. It will be extracted to its own repo after Helios is first deployed (see §11).
2. Server Architecture
Production Server
IP: 54.36.99.184 (OVH VPS #1, France — GDPR compliant)
Three environments (dev/staging/prod) run on the same server, isolated by Docker Compose project names, separate databases, and separate Redis DB numbers.
Option A: All on VPS #1 (start here)
VPS #1 (54.36.99.184)
├── nginx (ports 80/443, front door for everything)
│   ├── aletheia.groupe-suffren.com → aletheia-prod-web:8000
│   ├── aletheia-staging.groupe-suffren.com → aletheia-staging-web:8000
│   ├── aletheia-dev.groupe-suffren.com → aletheia-dev-web:8000
│   ├── cabinet-dentaire-aubagne.fr → helios-prod-web:3000
│   ├── le-canet.chirurgiens-dentistes.fr → helios-prod-web:3000
│   ├── staging.helios.groupe-suffren.com → helios-staging-web:3000
│   └── dev.helios.groupe-suffren.com → helios-dev-web:3000
│
├── Shared services
│   ├── PostgreSQL 18 (aletheia_prod/staging/dev + umami DBs)
│   └── Redis 7 (DB 0-5 Aletheia, 6-8 Helios if needed)
│
├── Aletheia containers (per env)
│   ├── aletheia-{env}-web (Gunicorn :8000)
│   ├── aletheia-{env}-celery
│   ├── aletheia-{env}-celery-beat
│   └── aletheia-{env}-celery-heavy
│
├── Helios containers (per env)
│   └── helios-{env}-web (Next.js standalone :3000)
│
├── Monitoring (Prometheus, Grafana, Loki, Alloy, exporters)
│
└── Docker networks
    ├── web (nginx ↔ app containers)
    ├── backend (apps ↔ PostgreSQL/Redis)
    └── monitoring
Helios → Aletheia API: Same Docker backend network, at http://aletheia-{env}-web:8000/api/v1/websites/. Traffic never leaves the host, so latency is negligible and no extra routing or configuration is needed.
ISR webhook (Aletheia → Helios): Same Docker web network. http://helios-{env}-web:3000/api/revalidate.
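Both internal URLs follow one naming pattern, so each environment's endpoints can be derived instead of hard-coded. The Python sketch below illustrates the pattern; the function names and the ENVIRONMENTS tuple are hypothetical, and only the hostnames, ports, and paths are taken from this contract.

```python
# Hypothetical helpers; only the container names, ports and paths
# come from the contract.
ENVIRONMENTS = ("dev", "staging", "prod")


def _check_env(env: str) -> None:
    """Reject anything that is not one of the three server environments."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")


def aletheia_api_url(env: str) -> str:
    """URL Helios calls over the Docker backend network."""
    _check_env(env)
    return f"http://aletheia-{env}-web:8000/api/v1/websites/"


def helios_revalidate_url(env: str) -> str:
    """URL Aletheia calls over the Docker web network to trigger ISR."""
    _check_env(env)
    return f"http://helios-{env}-web:3000/api/revalidate"
```

Rejecting unknown environment names early keeps a typo such as "stagin" from silently producing a hostname that does not resolve.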
Pros: Simplest. No vRack. No second server cost. One deploy pattern. One monitoring stack.
Cons: Shared CPU/RAM. If traffic grows, both apps compete for resources.
Option B: Prod Helios on VPS #2, dev/staging on VPS #1
VPS #1 (54.36.99.184)                       VPS #2 (new, Coolify)
├── nginx                                   ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev          └── Helios prod only
├── Helios: staging + dev                       ├── cabinet-dentaire-aubagne.fr
├── Shared services                             ├── le-canet.fr
└── Monitoring                                  └── ...practice domains
    │                                           │
    └────── OVH vRack (private) ────────────────┘
Helios prod → Aletheia prod: Via vRack private IP (http://10.0.0.1/api/v1/websites/ routed through VPS #1 nginx).
Helios dev/staging → Aletheia dev/staging: Docker network on VPS #1 (same as Option A).
Coolify setup:
- Project: "Helios"
- Environment "production": branch main, auto-deploy on push, practice domains
- Traefik handles SSL for practice domains on VPS #2
Pros: Dedicated resources for production Helios. Coolify UI for prod deploys.
Cons: vRack setup. Two deploy patterns (Makefile on VPS #1, Coolify on VPS #2).
Option C: Full split (backend VPS #1, frontend VPS #2)
VPS #1 (54.36.99.184)                       VPS #2 (new, Coolify)
├── nginx                                   ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev          ├── Helios: prod + staging + dev
├── Shared services                         └── (no DB, no Redis)
└── Monitoring
    │                                           │
    └────── OVH vRack (private) ────────────────┘
All Helios envs → Aletheia: Via vRack. Each Helios env points to its matching Aletheia env.
Coolify setup:
- Environment "production": branch main, practice domains
- Environment "staging": branch staging, staging.helios.groupe-suffren.com
- Environment "development": branch develop, dev.helios.groupe-suffren.com
Pros: Cleanest separation. Dedicated resources. Full Coolify for all envs.
Cons: Most configuration. vRack for all 3 envs. 3 cross-server API connections + 3 ISR webhooks.
Migration Path
Start with Option A. Migrate to B or C when:
- VPS #1 CPU is consistently above 70% or RAM above 80%
- You need Coolify preview environments (team grows, PR-based reviews)
- A practice gets significant traffic (thousands of daily visitors)
The migration is straightforward: move Helios containers to VPS #2, update API URLs to use vRack IP, configure Coolify or Docker Compose on VPS #2.
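The resource trigger can be checked mechanically against monitoring samples. The Python sketch below is one possible reading, assuming "consistently" means at least 90% of recent samples exceed the threshold. That fraction, the function names, and applying the same rule to RAM are assumptions, not part of this contract.

```python
from typing import Sequence

# Thresholds from the migration criteria (percent utilisation).
CPU_THRESHOLD = 70.0
RAM_THRESHOLD = 80.0


def exceeds_consistently(
    samples: Sequence[float], threshold: float, min_fraction: float = 0.9
) -> bool:
    """True if at least min_fraction of the samples are above threshold.

    min_fraction is an assumed definition of "consistently".
    """
    if not samples:
        return False
    above = sum(1 for value in samples if value > threshold)
    return above / len(samples) >= min_fraction


def should_consider_migration(
    cpu_samples: Sequence[float], ram_samples: Sequence[float]
) -> bool:
    """Resource-based trigger only; team size and traffic are judged by hand."""
    return exceeds_consistently(cpu_samples, CPU_THRESHOLD) or exceeds_consistently(
        ram_samples, RAM_THRESHOLD
    )
```

A single spike does not trip the check, which matches the intent of migrating only under sustained pressure.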
OOM Priority Allocation (Option A)
Scores are oom_score_adj values: a lower score means higher priority (killed last under memory pressure).
| Service | OOM Score | Memory Limit |
|---|---|---|
| shared_postgres | -800 | — (unlimited) |
| nginx-proxy | -600 | — |
| shared_redis | -500 | 128MB |
| aletheia-prod-web | -300 | 2GB |
| helios-prod-web | -200 | 1GB |
| aletheia-staging-web | -100 | 2GB |
| helios-staging-web | 0 | 512MB |
| aletheia-dev-web | 0 | 1GB |
| helios-dev-web | 100 | 512MB |
| celery workers | 200-600 | 256MB-2GB |
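To make the table's intent concrete, the Python sketch below sorts the fixed-score services from first-to-be-killed to last. It reflects only the ordering expressed by the adjustments: the kernel also weighs each process's actual memory use, so real behaviour can differ. Celery workers are left out because the table gives them a range rather than a single score, and the function name is hypothetical.

```python
# Scores copied from the table above; celery workers omitted (range 200-600).
OOM_SCORE_ADJ = {
    "shared_postgres": -800,
    "nginx-proxy": -600,
    "shared_redis": -500,
    "aletheia-prod-web": -300,
    "helios-prod-web": -200,
    "aletheia-staging-web": -100,
    "helios-staging-web": 0,
    "aletheia-dev-web": 0,
    "helios-dev-web": 100,
}


def kill_order(scores: dict[str, int]) -> list[str]:
    """Service names sorted from most expendable to most protected."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


if __name__ == "__main__":
    for name in kill_order(OOM_SCORE_ADJ):
        print(f"{OOM_SCORE_ADJ[name]:>5}  {name}")
```

Reading the output top to bottom gives the intended sacrifice order: dev first, then staging, then production apps, with the shared database last.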
3. Local Development Setup
Developer Machine
IP 192.168.0.244 is the developer's Mac on the local network — not a remote server.
Running Both Apps Locally
Mac (localhost)
├── Docker Desktop
│   └── Aletheia containers (docker compose up)
│       ├── aletheia-local-web → localhost:8000
│       ├── postgres → localhost:5433
│       └── redis → localhost:6379
│
└── Bare metal (no Docker)
    └── Helios (npm run dev) → localhost:3000
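Steps:
- Start Aletheia: run docker compose up in the Aletheia repo (web on localhost:8000).
- Start Helios: run npm run dev in the Helios repo (localhost:3000).
- Helios .env.local: set ALETHEIA_API_URL to the Aletheia API on localhost:8000.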
Multi-Tenancy Testing
Next.js resolves practices from the Host header. Add to /etc/hosts:
127.0.0.1 cabinet-dentaire-aubagne.local
127.0.0.1 le-canet.local
127.0.0.1 cabinet-bodin.local
127.0.0.1 david-simon-thiais.local
Browse http://cabinet-dentaire-aubagne.local:3000/ — the proxy reads the Host, maps to a practice, fetches from Aletheia.
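The Host-to-practice lookup can be pictured as a normalisation step followed by a table lookup. The sketch below is written in Python purely for illustration; the real resolution happens in the Helios Next.js code, and the function names, the www-stripping rule, and the use of practice codes as lookup values are assumptions.

```python
from typing import Optional

# Local test hosts from /etc/hosts, mapped to the practice codes used in
# this contract (cda, vsm, pds, ths). The mapping itself is illustrative.
PRACTICE_BY_HOST = {
    "cabinet-dentaire-aubagne.local": "cda",
    "le-canet.local": "vsm",
    "cabinet-bodin.local": "pds",
    "david-simon-thiais.local": "ths",
}


def normalize_host(host_header: str) -> str:
    """Lower-case the Host header, drop the port and a leading 'www.'."""
    host = host_header.strip().lower().split(":", 1)[0]
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def resolve_practice(host_header: str) -> Optional[str]:
    """Return the practice code for a Host header, or None if unknown."""
    return PRACTICE_BY_HOST.get(normalize_host(host_header))
```

Dropping the port matters locally, because the browser sends Host: cabinet-dentaire-aubagne.local:3000 rather than the bare name.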
Testing Against Server Environments
To test Helios locally against staging or dev Aletheia on the server:
# .env.local — point at server instead of local
ALETHEIA_API_URL=https://aletheia-dev.groupe-suffren.com/api/v1/websites
This is useful when you want to test with realistic, production-like data without running Aletheia locally.
ISR Webhook (Local Limitation)
The ISR revalidation webhook (Aletheia → Helios) does not work out of the box in local dev because:
- Aletheia runs in Docker and cannot reach localhost:3000 on the Mac host directly
- The webhook URL would have to be http://host.docker.internal:3000/api/revalidate (Docker Desktop for Mac)
Workaround: Manually trigger revalidation after content changes:
curl -X POST http://localhost:3000/api/revalidate \
  -H "Content-Type: application/json" \
  -d '{"secret":"dev-secret","tags":["page:implant-dentaire-aubagne"]}'
Or simply restart npm run dev (clears all cache).
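The same manual trigger can be scripted, which helps when several tags need revalidating after a content change. This is a minimal Python sketch using only the standard library; the function names are hypothetical, and the payload shape (secret plus tags) simply mirrors the curl command above.

```python
import json
import urllib.request


def build_revalidate_request(
    tags: list[str],
    secret: str = "dev-secret",
    base_url: str = "http://localhost:3000",
) -> urllib.request.Request:
    """Build the POST request that the curl workaround sends."""
    body = json.dumps({"secret": secret, "tags": list(tags)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/revalidate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_revalidation(tags: list[str]) -> int:
    """Send the request to the local Helios dev server; returns the HTTP status."""
    with urllib.request.urlopen(build_revalidate_request(tags), timeout=5) as resp:
        return resp.status


# Usage (requires `npm run dev` to be running):
#   trigger_revalidation(["page:implant-dentaire-aubagne"])
```

Building the request separately from sending it keeps the payload easy to inspect without a running server.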
Media Files
Aletheia returns absolute media URLs in API responses (e.g., http://localhost:8000/media/...). These work directly in local dev because Aletheia serves media on the same host. In production, URLs point to the Cloudflare CDN domain.
Local Dev Is Identical Across Server Options
Regardless of whether the server uses Option A, B, or C — local development is always the same: both apps on your Mac, Helios at :3000, Aletheia at :8000, connected via localhost.
4. Shared Services
PostgreSQL 18
Single container, multiple databases. Port 127.0.0.1:5432 (localhost only on server, not exposed to network).
| Database | User | Environment |
|---|---|---|
| aletheia_prod | aletheia_prod | Aletheia production |
| aletheia_staging | aletheia_staging | Aletheia staging |
| aletheia_dev | aletheia_dev | Aletheia development |
| umami | umami | Umami Analytics (shared) |
Helios does not have its own database. It reads data from Aletheia's API.
Redis 7
Single container, --maxmemory 128mb, allkeys-lru eviction.
| DB | Use |
|---|---|
| 0 | Aletheia prod — Celery broker + result |
| 1 | Aletheia prod — cache |
| 2 | Aletheia staging — Celery broker + result |
| 3 | Aletheia staging — cache |
| 4 | Aletheia dev — Celery broker + result |
| 5 | Aletheia dev — cache |
| 6 | Helios prod — ISR cache (if needed, otherwise filesystem) |
| 7 | Helios staging — ISR cache |
| 8 | Helios dev — ISR cache |
| 9-15 | Reserved |
If Helios uses Redis for ISR caching, consider increasing maxmemory to 192mb or 256mb.
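Keeping the DB allocation in one lookup table makes it harder for two services to end up on the same Redis DB number. The Python sketch below encodes the table above; the role labels, the function name, and the shared_redis hostname with port 6379 are assumptions for illustration.

```python
# (app, env, role) -> Redis DB number, copied from the allocation table.
REDIS_DB = {
    ("aletheia", "prod", "celery"): 0,
    ("aletheia", "prod", "cache"): 1,
    ("aletheia", "staging", "celery"): 2,
    ("aletheia", "staging", "cache"): 3,
    ("aletheia", "dev", "celery"): 4,
    ("aletheia", "dev", "cache"): 5,
    ("helios", "prod", "isr"): 6,
    ("helios", "staging", "isr"): 7,
    ("helios", "dev", "isr"): 8,
}


def redis_url(
    app: str, env: str, role: str, host: str = "shared_redis", port: int = 6379
) -> str:
    """Build a redis:// URL for one service; host and port are assumed defaults."""
    try:
        db = REDIS_DB[(app, env, role)]
    except KeyError:
        raise ValueError(f"no Redis DB allocated for {(app, env, role)}") from None
    return f"redis://{host}:{port}/{db}"
```

For example, redis_url("aletheia", "staging", "cache") yields a URL ending in /3, matching the staging cache row.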
5. Docker Compose Structure
Pattern (Same for Both Apps)
app/
├── docker-compose.yml # Base: service definitions, networks
├── docker-compose.prod.yml # Prod: OOM scores, memory limits, restart: always
├── docker-compose.staging.yml # Staging: lower limits, restart: unless-stopped
├── docker-compose.dev.yml # Dev-on-server: development target, lowest limits
├── docker-compose.override.yml # Local dev: DB/Redis, source mounting, hot reload
├── Dockerfile # Multi-stage build
└── deploy.sh # Deploy script (git fetch, build, restart)
Helios-Specific
# docker-compose.yml (base)
services:
  web:
    build:
      context: .
      target: production
    container_name: helios-${ENVIRONMENT:-local}-web
    env_file:
      - path: ${ENV_FILE:-.env.local}
        required: false
    networks:
      - web
      - backend  # needed for direct API calls to Aletheia (Option A)
    expose:
      - "3000"
    volumes:
      - nextjs-cache:/app/.next/cache  # ISR file cache persistence

networks:
  web:
    external: true
  backend:
    external: true

volumes:
  nextjs-cache:
Key differences from Aletheia:
- One service (no Celery, no beat, no heavy worker)
- Port 3000 (not 8000)
- No database connection (talks to Aletheia via HTTP)
- Cache volume for ISR persistence across deploys
Helios Dockerfile (Sketch)
# Stage 1: Dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
# Stage 2: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
# Stage 3: Production
FROM node:22-alpine AS production
WORKDIR /app
RUN addgroup -S nextjs && adduser -S nextjs -G nextjs
COPY --from=builder --chown=nextjs:nextjs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nextjs /app/.next/static ./.next/static
COPY --from=builder --chown=nextjs:nextjs /app/public ./public
USER nextjs
EXPOSE 3000
ENV PORT=3000 HOSTNAME="0.0.0.0"
CMD ["node", "server.js"]
Uses Next.js standalone output mode — minimal production image with only the needed files.
6. Nginx Routing
Domain Mapping
Practice codes: cda (Aubagne), vsm (Le Canet), pds (Bodin), ths (David Simon)
| Domain | Routes to | Environment |
|---|---|---|
| aletheia.groupe-suffren.com | aletheia-prod-web:8000 | Aletheia prod |
| aletheia-staging.groupe-suffren.com | aletheia-staging-web:8000 | Aletheia staging |
| aletheia-dev.groupe-suffren.com | aletheia-dev-web:8000 | Aletheia dev |
| cabinet-dentaire-aubagne.fr | helios-prod-web:3000 | Helios prod (real domain) |
| (+ all practice prod domains) | helios-prod-web:3000 | Helios prod |
| cda.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Aubagne) |
| cda-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Aubagne) |
| cda-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Aubagne) |
| vsm.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Le Canet) |
| vsm-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Le Canet) |
| vsm-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Le Canet) |
| (same pattern for pds, ths) | | |
| monitoring.groupe-suffren.com | grafana:3000 | Monitoring |
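The per-practice preview domains follow a regular pattern, so the rows elided as "same pattern for pds, ths" can be generated. A small Python sketch of that pattern follows; the function names are hypothetical, and the pattern is inferred from the cda and vsm rows.

```python
PRACTICE_CODES = ("cda", "vsm", "pds", "ths")
ENVIRONMENTS = ("prod", "staging", "dev")


def preview_domain(code: str, env: str) -> str:
    """Preview domain for a practice: {code}[-{env}].groupe-suffren.com."""
    if code not in PRACTICE_CODES:
        raise ValueError(f"unknown practice code: {code!r}")
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    suffix = "" if env == "prod" else f"-{env}"
    return f"{code}{suffix}.groupe-suffren.com"


def helios_upstream(env: str) -> str:
    """Container and port that nginx proxies a Helios domain to."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"helios-{env}-web:3000"


if __name__ == "__main__":
    # Print the full preview-domain routing table for all practices.
    for code in PRACTICE_CODES:
        for env in ENVIRONMENTS:
            print(f"{preview_domain(code, env)} -> {helios_upstream(env)}")
```

Production preview domains carry no environment suffix, which is why prod is special-cased.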
Helios Nginx Vhost (prod example)
server {
    listen 443 ssl;
    server_name
        cabinet-dentaire-aubagne.fr
        www.cabinet-dentaire-aubagne.fr
        le-canet.chirurgiens-dentistes.fr
        cabinet-bodin.fr
        dr-david-simon.chirurgiens-dentistes.fr
    ;
    # Add new practice domains here

    ssl_certificate /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/privkey.pem;
    # OR: Cloudflare origin cert (single cert for all domains behind Cloudflare)

    resolver 127.0.0.11 valid=30s;

    # Next.js static assets (immutable, long cache)
    location /_next/static/ {
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        expires 365d;
        add_header Cache-Control "public, immutable";
    }

    # All requests → Next.js
    location / {
        limit_req zone=general burst=20 nodelay;
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;
    }
}
Important: The Host header is passed through (proxy_set_header Host $host). Next.js uses this to resolve which practice to render.
SSL Strategy
Recommended: Cloudflare (already in the spec).
- All practice domains point DNS to Cloudflare
- Cloudflare handles SSL termination at the edge
- Nginx uses a Cloudflare origin certificate (one cert covers all domains)
- Simplifies cert management vs. individual Let's Encrypt certs per domain
7. Coolify Setup (Options B/C, Future)
When migrating to a second VPS with Coolify:
Coolify Project Structure
Coolify Dashboard (VPS #2)
└── Project: "Helios"
    ├── Environment: production
    │   ├── Source: GitHub baudry-suffren/helios, branch main
    │   ├── Build: Dockerfile (standalone)
    │   ├── Domains: cabinet-dentaire-aubagne.fr, le-canet.fr, ...
    │   ├── Auto-deploy: on push to main
    │   └── Env vars:
    │       ALETHEIA_API_URL=http://10.0.0.1:80/api/v1/websites # vRack → VPS #1 nginx
    │       REVALIDATION_SECRET=<secret>
    │
    ├── Environment: staging
    │   ├── Source: branch staging
    │   ├── Domain: staging.helios.groupe-suffren.com
    │   └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-staging/...
    │
    └── Environment: development
        ├── Source: branch develop
        ├── Domain: dev.helios.groupe-suffren.com
        └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-dev/...
vRack Configuration
OVH vRack creates a private network between VPS #1 and VPS #2:
- VPS #1 (Aletheia): 10.0.0.1
- VPS #2 (Helios): 10.0.0.2
- Traffic between them is private, fast, and free
VPS #1 nginx needs additional server blocks (or location blocks) to expose each Aletheia environment's API on the vRack interface for Helios to reach.
ISR Webhook (Aletheia → Helios, cross-server)
Aletheia sends the revalidation webhook to Helios over the vRack. Coolify/Traefik on VPS #2 routes it to the correct Helios container.
8. Deploy & Secrets
Option A (Makefile Deploy, Same Pattern for Both Apps)
Aletheia (existing):
make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build, migrate, collectstatic, restart, nginx swap
Helios (new, simpler — no migrations, no collectstatic):
make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build (includes next build in Docker), restart, nginx swap
Secret Management (SOPS + age)
Same age keypair across all repos. Key stored at /opt/docker/.age-key.txt on server.
| Repo | Encrypted files |
|---|---|
| Aletheia | infra/envs/.env.{dev,staging,prod}.enc |
| Helios | infra/envs/.env.{dev,staging,prod}.enc |
| Aether (future) | shared/.env.enc, nginx/.htpasswd.enc, init-scripts/*.sql.enc |
Server Directory Layout
/opt/docker/
├── nginx/                # Reverse proxy (all vhosts)
├── shared/               # PostgreSQL + Redis
├── aletheia/
│   ├── repo/             # Git clone
│   ├── envs/             # .env.dev, .env.staging, .env.prod
│   └── media/            # User uploads (dev/, staging/, prod/)
├── helios/
│   ├── repo/             # Git clone
│   └── envs/             # .env.dev, .env.staging, .env.prod
├── monitoring/           # Full monitoring stack
└── backups/
    ├── scripts/backup.sh
    ├── daily/
    └── weekly/
9. Monitoring
Automatic (Zero Config for New Apps)
- Alloy discovers all Docker containers automatically via discovery.docker. Helios containers appear with labels container=helios-prod-web, compose_service=web. Logs flow to Loki automatically.
- Docker exporter tracks CPU/memory/network for all containers. Helios appears in the existing Grafana Docker dashboard automatically.
Manual Additions (When Helios Deploys)
| Config file | Change |
|---|---|
| prometheus/prometheus.yml | Add Helios practice domains to blackbox-health and blackbox-ssl scrape targets |
| prometheus/alerts/helios.rules.yml | New alert rules: health check failing, container restarts |
| grafana/provisioning/dashboards/ | Optional: Helios-specific dashboard (if Next.js exports Prometheus metrics) |
10. Backups
Current
Daily at 2 AM: pg_dump of aletheia_prod + aletheia_staging, tar.gz of Aletheia media. 7-day daily + 28-day weekly retention.
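The retention rule can be expressed as a small pruning check. The Python sketch below is illustrative only and is not the logic of backup.sh: the function names are hypothetical, and it assumes a backup is kept while its age in days is at most the retention window (7 for daily, 28 for weekly).

```python
from datetime import date, timedelta

# Retention windows from the backup policy above.
RETENTION_DAYS = {"daily": 7, "weekly": 28}


def is_expired(kind: str, taken_on: date, today: date) -> bool:
    """True if a backup of this kind is older than its retention window.

    Assumption: a backup exactly at the window boundary is still kept.
    """
    return (today - taken_on) > timedelta(days=RETENTION_DAYS[kind])


def expired_backups(
    backups: list[tuple[str, date]], today: date
) -> list[tuple[str, date]]:
    """Filter (kind, date) pairs down to those that should be pruned."""
    return [(kind, day) for kind, day in backups if is_expired(kind, day, today)]
```

Passing today explicitly, rather than calling date.today() inside, keeps the rule easy to test against fixed dates.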
Changes for Helios
Add the umami database to the backup loop alongside aletheia_prod and aletheia_staging.
Helios has nothing to back up:
- No database (reads from Aletheia API)
- No user uploads (media is in Aletheia)
- Code is in git
- ISR cache is ephemeral (regenerated from API data)
11. Aether Extraction Roadmap
Phase 1: Now (Infra Stays in Aletheia Repo)
All infrastructure configs remain at ~/coding/aletheia/aletheia_v2/infra/. Helios-specific additions (nginx vhosts, env templates) are added there. This is pragmatic:
- The Makefile, setup.sh, and SOPS workflow all work today
- No refactoring needed before Helios has code
Phase 2: After First Helios Deploy (Create Aether Repo)
Create baudry-suffren/aether repo. Extract from aletheia_v2/infra/:
| Goes to Aether | Stays in Aletheia | Goes in Helios |
|---|---|---|
| nginx/ (all vhosts, nginx.conf) | docker-compose.yml + overrides | docker-compose.yml + overrides |
| shared/ (postgres, redis, init scripts) | Dockerfile | Dockerfile |
| monitoring/ (full stack) | deploy.sh | deploy.sh |
| backups/ | Makefile (app targets) | Makefile (app targets) |
| security/ (SSH, firewall, fail2ban, sysctl) | infra/envs/ (Aletheia env files) | infra/envs/ (Helios env files) |
| setup.sh | | |
| .sops.yaml | | |
| Makefile (infra targets: infra-diff, infra-deploy, infra-pull, encrypt, decrypt) | | |
On the server, Aether clones to /opt/docker/aether-repo/. The infra Makefile runs from there.
Phase 2 Trigger
Create Aether when you find yourself editing nginx configs or monitoring rules from the Aletheia repo and it feels wrong. Don't extract proactively — wait until the coupling is actually annoying.
12. Changelog
Track infrastructure changes that affect either app. Format: date, what changed, why, which apps affected.