
Infrastructure Contract — Aletheia + Helios + Aether

Date: 2026-03-27
Status: Active
Shared across: Aletheia, Helios, Aether repos


1. Context & Repos

Three repositories share a single infrastructure stack:

| Repo | Purpose | Language | GitHub | Local path |
|---|---|---|---|---|
| Aletheia | Practice management backend + website CMS | Python 3.13, Django 5.2 | baudry-suffren/aletheia_v2 | ~/coding/aletheia/aletheia_v2/ |
| Helios | Multi-tenant dental practice websites | TypeScript, Next.js 16 | TBD (create during B1) | ~/coding/helios/helios_test2/ |
| Aether | Shared infrastructure (nginx, DB, Redis, monitoring, security, backups) | Docker Compose, shell | TBD (create after first Helios deploy) | ~/coding/aether/ (future) |

Current state: Aether doesn't exist yet. All infrastructure lives in aletheia_v2/infra/. It will be extracted to its own repo after Helios is first deployed (see §11).


2. Server Architecture

Production Server

IP: 54.36.99.184 (OVH VPS #1, France — GDPR compliant)

Three environments (dev/staging/prod) run on the same server, isolated by Docker Compose project names, separate databases, and separate Redis DB numbers.

Option A — All on VPS #1 (start here)

VPS #1 (54.36.99.184)
├── nginx (ports 80/443 — front door for everything)
│   ├── aletheia.groupe-suffren.com        → aletheia-prod-web:8000
│   ├── aletheia-staging.groupe-suffren.com → aletheia-staging-web:8000
│   ├── aletheia-dev.groupe-suffren.com    → aletheia-dev-web:8000
│   ├── cabinet-dentaire-aubagne.fr        → helios-prod-web:3000
│   ├── le-canet.chirurgiens-dentistes.fr  → helios-prod-web:3000
│   ├── staging.helios.groupe-suffren.com  → helios-staging-web:3000
│   └── dev.helios.groupe-suffren.com      → helios-dev-web:3000
├── Shared services
│   ├── PostgreSQL 18 (aletheia_prod/staging/dev + umami DBs)
│   └── Redis 7 (DB 0-5 Aletheia, 6-8 Helios if needed)
├── Aletheia containers (per env)
│   ├── aletheia-{env}-web (Gunicorn :8000)
│   ├── aletheia-{env}-celery
│   ├── aletheia-{env}-celery-beat
│   └── aletheia-{env}-celery-heavy
├── Helios containers (per env)
│   └── helios-{env}-web (Next.js standalone :3000)
├── Monitoring (Prometheus, Grafana, Loki, Alloy, exporters)
└── Docker networks
    ├── web       (nginx ↔ app containers)
    ├── backend   (apps ↔ PostgreSQL/Redis)
    └── monitoring

Helios → Aletheia API: Same Docker backend network. http://aletheia-{env}-web:8000/api/v1/websites/. No cross-host hop, no extra configuration.

ISR webhook (Aletheia → Helios): Same Docker web network. http://helios-{env}-web:3000/api/revalidate.
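Both internal URLs follow the `{app}-{env}-web` container naming convention, so they can be derived per environment. A hypothetical TypeScript helper (illustration only, not actual Helios code):

```typescript
type Env = "prod" | "staging" | "dev";

// Hypothetical helpers — containers follow the {app}-{env}-web naming
// convention, so internal URLs can be derived rather than hard-coded.
function aletheiaApiBase(env: Env): string {
  // Helios → Aletheia, over the shared Docker "backend" network.
  return `http://aletheia-${env}-web:8000/api/v1/websites/`;
}

function heliosRevalidateUrl(env: Env): string {
  // Aletheia → Helios ISR webhook, over the shared Docker "web" network.
  return `http://helios-${env}-web:3000/api/revalidate`;
}
```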

Pros: Simplest. No vRack. No second server cost. One deploy pattern. One monitoring stack.
Cons: Shared CPU/RAM. If traffic grows, both apps compete for resources.

Option B — Prod Helios on VPS #2, dev/staging on VPS #1

VPS #1 (54.36.99.184)                    VPS #2 (new, Coolify)
├── nginx                                ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev       └── Helios prod only
├── Helios: staging + dev                    ├── cabinet-dentaire-aubagne.fr
├── Shared services                          ├── le-canet.fr
└── Monitoring                               └── ...practice domains
         │                                        │
         └────── OVH vRack (private) ─────────────┘

Helios prod → Aletheia prod: Via vRack private IP (http://10.0.0.1/api/v1/websites/ routed through VPS #1 nginx).

Helios dev/staging → Aletheia dev/staging: Docker network on VPS #1 (same as Option A).

Coolify setup:

  • Project: "Helios"
  • Environment "production": branch main, auto-deploy on push, practice domains
  • Traefik handles SSL for practice domains on VPS #2

Pros: Dedicated resources for production Helios. Coolify UI for prod deploys.
Cons: vRack setup. Two deploy patterns (Makefile on VPS #1, Coolify on VPS #2).

Option C — Full split (backend VPS #1, frontend VPS #2)

VPS #1 (54.36.99.184)                    VPS #2 (new, Coolify)
├── nginx                                ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev       ├── Helios: prod + staging + dev
├── Shared services                      └── (no DB, no Redis)
└── Monitoring
         │                                        │
         └────── OVH vRack (private) ─────────────┘

All Helios envs → Aletheia: Via vRack. Each Helios env points to its matching Aletheia env.

Coolify setup:

  • Environment "production": branch main, practice domains
  • Environment "staging": branch staging, staging.helios.groupe-suffren.com
  • Environment "development": branch develop, dev.helios.groupe-suffren.com

Pros: Cleanest separation. Dedicated resources. Full Coolify for all envs.
Cons: Most configuration. vRack for all 3 envs. 3 cross-server API connections + 3 ISR webhooks.

Migration Path

Start with Option A. Migrate to B or C when:

  • VPS #1 CPU consistently > 70% or RAM > 80%
  • You need Coolify preview environments (team grows, PR-based reviews)
  • A practice gets significant traffic (thousands of daily visitors)

The migration is straightforward: move Helios containers to VPS #2, update API URLs to use vRack IP, configure Coolify or Docker Compose on VPS #2.

OOM Priority Allocation (Option A)

Lower score = higher priority (killed last under memory pressure).

| Service | OOM Score | Memory Limit |
|---|---|---|
| shared_postgres | -800 | — (unlimited) |
| nginx-proxy | -600 | |
| shared_redis | -500 | 128MB |
| aletheia-prod-web | -300 | 2GB |
| helios-prod-web | -200 | 1GB |
| aletheia-staging-web | -100 | 2GB |
| helios-staging-web | 0 | 512MB |
| aletheia-dev-web | 0 | 1GB |
| helios-dev-web | 100 | 512MB |
| celery workers | 200-600 | 256MB-2GB |
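In Compose, the table above maps to per-service keys. A sketch of the Helios prod slice (values from the table; `oom_score_adj` and `mem_limit` are standard Compose keys):

```yaml
# docker-compose.prod.yml (sketch) — Helios prod row of the table above
services:
  web:
    oom_score_adj: -200   # killed after staging/dev apps, before nginx/DB
    mem_limit: 1g
    restart: always
```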

3. Local Development Setup

Developer Machine

IP 192.168.0.244 is the developer's Mac on the local network — not a remote server.

Running Both Apps Locally

Mac (localhost)
├── Docker Desktop
│   └── Aletheia containers (docker compose up)
│       ├── aletheia-local-web     → localhost:8000
│       ├── postgres               → localhost:5433
│       └── redis                  → localhost:6379
└── Bare metal (no Docker)
    └── Helios (npm run dev)       → localhost:3000

Steps:

  1. Start Aletheia:

    cd ~/coding/aletheia/aletheia_v2
    docker compose up
    # Available at http://localhost:8000
    

  2. Start Helios:

    cd ~/coding/helios/helios_test2   # (or wherever Next.js code lives after B1)
    npm run dev
    # Available at http://localhost:3000
    

  3. Helios .env.local:

    ALETHEIA_API_URL=http://localhost:8000/api/v1/websites
    REVALIDATION_SECRET=dev-secret
    

Multi-Tenancy Testing

Next.js resolves practices from the Host header. Add to /etc/hosts:

127.0.0.1  cabinet-dentaire-aubagne.local
127.0.0.1  le-canet.local
127.0.0.1  cabinet-bodin.local
127.0.0.1  david-simon-thiais.local

Browse http://cabinet-dentaire-aubagne.local:3000/ — the proxy reads the Host, maps to a practice, fetches from Aletheia.
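A hypothetical sketch of that Host-to-practice mapping (the real resolution presumably comes from Aletheia data; this static map and function exist only to illustrate the mechanism, including the .local aliases above):

```typescript
// Illustrative only — real Helios code would load this mapping from Aletheia.
const PRACTICE_BY_HOST: Record<string, string> = {
  "cabinet-dentaire-aubagne.fr": "cda",
  "cabinet-dentaire-aubagne.local": "cda",
  "le-canet.chirurgiens-dentistes.fr": "vsm",
  "le-canet.local": "vsm",
  "cabinet-bodin.local": "pds",
  "david-simon-thiais.local": "ths",
};

function resolvePractice(hostHeader: string): string | null {
  // Strip the port (":3000" in local dev) and any "www." prefix.
  const host = hostHeader.split(":")[0].replace(/^www\./, "");
  return PRACTICE_BY_HOST[host] ?? null;
}
```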

Testing Against Server Environments

To test Helios locally against staging or dev Aletheia on the server:

# .env.local — point at server instead of local
ALETHEIA_API_URL=https://aletheia-dev.groupe-suffren.com/api/v1/websites

This is useful when you want to test with real production-like data without running Aletheia locally.

ISR Webhook (Local Limitation)

The ISR revalidation webhook (Aletheia → Helios) does not work in local dev because:

  • Aletheia runs in Docker and can't reach localhost:3000 on the Mac host easily
  • The webhook URL would need to be http://host.docker.internal:3000/api/revalidate (Docker Desktop for Mac)

Workaround: Manually trigger revalidation after content changes:

curl -X POST http://localhost:3000/api/revalidate \
  -H "Content-Type: application/json" \
  -d '{"secret":"dev-secret","tags":["page:implant-dentaire-aubagne"]}'

Or simply restart npm run dev (clears all cache).
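The /api/revalidate handler isn't written yet; whatever form it takes, it should compare the secret in constant time. A sketch of just that check (helper name is hypothetical; the real handler would go on to call Next.js tag revalidation):

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical helper for the /api/revalidate route: constant-time
// comparison of the submitted secret against REVALIDATION_SECRET.
function secretMatches(received: string, expected: string): boolean {
  const a = Buffer.from(received);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```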

Media Files

Aletheia returns absolute media URLs in API responses (e.g., http://localhost:8000/media/...). These work directly in local dev because Aletheia serves media on the same host. In production, URLs point to the Cloudflare CDN domain.

Local Dev Is Identical Across Server Options

Regardless of whether the server uses Option A, B, or C — local development is always the same: both apps on your Mac, Helios at :3000, Aletheia at :8000, connected via localhost.


4. Shared Services

PostgreSQL 18

Single container, multiple databases. Bound to 127.0.0.1:5432 (localhost only on the server, not exposed to the network).

| Database | User | Environment |
|---|---|---|
| aletheia_prod | aletheia_prod | Aletheia production |
| aletheia_staging | aletheia_staging | Aletheia staging |
| aletheia_dev | aletheia_dev | Aletheia development |
| umami | umami | Umami Analytics (shared) |

Helios does not have its own database. It reads data from Aletheia's API.

Redis 7

Single container, --maxmemory 128mb, allkeys-lru eviction.

| DB | Use |
|---|---|
| 0 | Aletheia prod — Celery broker + result |
| 1 | Aletheia prod — cache |
| 2 | Aletheia staging — Celery broker + result |
| 3 | Aletheia staging — cache |
| 4 | Aletheia dev — Celery broker + result |
| 5 | Aletheia dev — cache |
| 6 | Helios prod — ISR cache (if needed, otherwise filesystem) |
| 7 | Helios staging — ISR cache |
| 8 | Helios dev — ISR cache |
| 9-15 | Reserved |

If Helios uses Redis for ISR caching, consider increasing maxmemory to 192mb or 256mb.
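If Helios does adopt Redis, the allocation above can be derived rather than hard-coded. A hypothetical helper (the `shared_redis` hostname follows the shared-container name used elsewhere in this document):

```typescript
// DB numbers from the allocation table: 6 = prod, 7 = staging, 8 = dev.
const ISR_CACHE_DB = { prod: 6, staging: 7, dev: 8 } as const;

// Hypothetical helper — builds the Redis URL for a Helios environment's
// ISR cache, assuming "shared_redis" resolves on the backend network.
function heliosRedisUrl(env: keyof typeof ISR_CACHE_DB): string {
  return `redis://shared_redis:6379/${ISR_CACHE_DB[env]}`;
}
```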


5. Docker Compose Structure

Pattern (Same for Both Apps)

app/
├── docker-compose.yml              # Base: service definitions, networks
├── docker-compose.prod.yml         # Prod: OOM scores, memory limits, restart: always
├── docker-compose.staging.yml      # Staging: lower limits, restart: unless-stopped
├── docker-compose.dev.yml          # Dev-on-server: development target, lowest limits
├── docker-compose.override.yml     # Local dev: DB/Redis, source mounting, hot reload
├── Dockerfile                      # Multi-stage build
└── deploy.sh                       # Deploy script (git fetch, build, restart)

Helios-Specific

# docker-compose.yml (base)
services:
  web:
    build:
      context: .
      target: production
    container_name: helios-${ENVIRONMENT:-local}-web
    env_file:
      - path: ${ENV_FILE:-.env.local}
        required: false
    networks:
      - web
      - backend    # needed for direct API calls to Aletheia (Option A)
    expose:
      - "3000"
    volumes:
      - nextjs-cache:/app/.next/cache    # ISR file cache persistence

networks:
  web:
    external: true
  backend:
    external: true

volumes:
  nextjs-cache:

Key differences from Aletheia:

  • One service (no Celery, no beat, no heavy worker)
  • Port 3000 (not 8000)
  • No database connection (talks to Aletheia via HTTP)
  • Cache volume for ISR persistence across deploys

Helios Dockerfile (Sketch)

# Stage 1: Dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production
FROM node:22-alpine AS production
WORKDIR /app
RUN addgroup -S nextjs && adduser -S nextjs -G nextjs
COPY --from=builder --chown=nextjs:nextjs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nextjs /app/.next/static ./.next/static
COPY --from=builder --chown=nextjs:nextjs /app/public ./public
USER nextjs
EXPOSE 3000
ENV PORT=3000 HOSTNAME="0.0.0.0"
CMD ["node", "server.js"]

Uses Next.js standalone output mode — minimal production image with only the needed files.


6. Nginx Routing

Domain Mapping

Practice codes: cda (Aubagne), vsm (Le Canet), pds (Bodin), ths (David Simon)

| Domain | Routes to | Environment |
|---|---|---|
| aletheia.groupe-suffren.com | aletheia-prod-web:8000 | Aletheia prod |
| aletheia-staging.groupe-suffren.com | aletheia-staging-web:8000 | Aletheia staging |
| aletheia-dev.groupe-suffren.com | aletheia-dev-web:8000 | Aletheia dev |
| cabinet-dentaire-aubagne.fr | helios-prod-web:3000 | Helios prod (real domain) |
| (+ all practice prod domains) | helios-prod-web:3000 | Helios prod |
| cda.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Aubagne) |
| cda-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Aubagne) |
| cda-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Aubagne) |
| vsm.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Le Canet) |
| vsm-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Le Canet) |
| vsm-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Le Canet) |
| (same pattern for pds, ths) | | |
| monitoring.groupe-suffren.com | grafana:3000 | Monitoring |

Helios Nginx Vhost (prod example)

server {
    listen 443 ssl;
    server_name
        cabinet-dentaire-aubagne.fr
        www.cabinet-dentaire-aubagne.fr
        le-canet.chirurgiens-dentistes.fr
        cabinet-bodin.fr
        dr-david-simon.chirurgiens-dentistes.fr
        ;
        # Add new practice domains here

    ssl_certificate     /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/privkey.pem;
    # OR: Cloudflare origin cert (single cert for all domains behind Cloudflare)

    resolver 127.0.0.11 valid=30s;

    # Next.js static assets (immutable, long cache)
    location /_next/static/ {
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        expires 365d;
        add_header Cache-Control "public, immutable";
    }

    # All requests → Next.js
    location / {
        limit_req zone=general burst=20 nodelay;
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;
    }
}

Important: The Host header is passed through (proxy_set_header Host $host). Next.js uses this to resolve which practice to render.

SSL Strategy

Recommended: Cloudflare (already in the spec).

  • All practice domains point DNS to Cloudflare
  • Cloudflare handles SSL termination at the edge
  • Nginx uses a Cloudflare origin certificate (one cert covers all domains)
  • Simplifies cert management vs. individual Let's Encrypt certs per domain


7. Coolify Setup (Options B/C — Future)

When migrating to a second VPS with Coolify:

Coolify Project Structure

Coolify Dashboard (VPS #2)
└── Project: "Helios"
    ├── Environment: production
    │   ├── Source: GitHub baudry-suffren/helios, branch main
    │   ├── Build: Dockerfile (standalone)
    │   ├── Domains: cabinet-dentaire-aubagne.fr, le-canet.fr, ...
    │   ├── Auto-deploy: on push to main
    │   └── Env vars:
    │       ALETHEIA_API_URL=http://10.0.0.1:80/api/v1/websites  # vRack → VPS #1 nginx
    │       REVALIDATION_SECRET=<secret>
    ├── Environment: staging
    │   ├── Source: branch staging
    │   ├── Domain: staging.helios.groupe-suffren.com
    │   └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-staging/...
    └── Environment: development
        ├── Source: branch develop
        ├── Domain: dev.helios.groupe-suffren.com
        └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-dev/...

vRack Configuration

OVH vRack creates a private network between VPS #1 and VPS #2:

  • VPS #1 (Aletheia): 10.0.0.1
  • VPS #2 (Helios): 10.0.0.2
  • Traffic between them is private, fast, and free

VPS #1 nginx needs additional server blocks (or location blocks) to expose each Aletheia environment's API on the vRack interface for Helios to reach.
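A sketch of one such block (assumptions: the vRack interface is 10.0.0.1 and the path prefixes match the Coolify env vars in this section; adapt to the real layout):

```nginx
# Sketch — expose Aletheia prod's API on the vRack interface only.
# Staging/dev would get /api-staging/ and /api-dev/ location blocks
# proxying to their matching containers.
server {
    listen 10.0.0.1:80;

    location /api/ {
        proxy_pass http://aletheia-prod-web:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```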

ISR Webhook (Aletheia → Helios, cross-server)

Aletheia sends the webhook to Helios via vRack:

HELIOS_REVALIDATION_URL=http://10.0.0.2:3000/api/revalidate  # prod

Coolify/Traefik on VPS #2 routes this to the correct Helios container.


8. Deploy & Secrets

Option A (Makefile Deploy — Same Pattern for Both Apps)

Aletheia (existing):

make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build, migrate, collectstatic, restart, nginx swap

Helios (new, simpler — no migrations, no collectstatic):

make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build (includes next build in Docker), restart, nginx swap

Secret Management (SOPS + age)

Same age keypair across all repos. Key stored at /opt/docker/.age-key.txt on server.

| Repo | Encrypted files |
|---|---|
| Aletheia | infra/envs/.env.{dev,staging,prod}.enc |
| Helios | infra/envs/.env.{dev,staging,prod}.enc |
| Aether (future) | shared/.env.enc, nginx/.htpasswd.enc, init-scripts/*.sql.enc |
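A minimal .sops.yaml for the Helios repo could follow the same pattern as Aletheia's (sketch; the age recipient is a placeholder, not the real public key):

```yaml
# .sops.yaml (sketch) — encrypt env files for the shared age keypair.
# <AGE_PUBLIC_KEY> is a placeholder for the real recipient.
creation_rules:
  - path_regex: infra/envs/.*
    age: <AGE_PUBLIC_KEY>
```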

Server Directory Layout

/opt/docker/
├── nginx/                    # Reverse proxy (all vhosts)
├── shared/                   # PostgreSQL + Redis
├── aletheia/
│   ├── repo/                 # Git clone
│   ├── envs/                 # .env.dev, .env.staging, .env.prod
│   └── media/                # User uploads (dev/, staging/, prod/)
├── helios/
│   ├── repo/                 # Git clone
│   └── envs/                 # .env.dev, .env.staging, .env.prod
├── monitoring/               # Full monitoring stack
└── backups/
    ├── scripts/backup.sh
    ├── daily/
    └── weekly/

9. Monitoring

Automatic (Zero Config for New Apps)

  • Alloy discovers all Docker containers automatically via discovery.docker. Helios containers appear with labels container=helios-prod-web, compose_service=web. Logs flow to Loki automatically.
  • Docker exporter tracks CPU/memory/network for all containers. Helios appears in existing Grafana Docker dashboard automatically.

Manual Additions (When Helios Deploys)

| Config file | Change |
|---|---|
| prometheus/prometheus.yml | Add Helios practice domains to blackbox-health and blackbox-ssl scrape targets |
| prometheus/alerts/helios.rules.yml | New alert rules: health check failing, container restarts |
| grafana/provisioning/dashboards/ | Optional: Helios-specific dashboard (if Next.js exports Prometheus metrics) |
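The prometheus.yml change is just new targets on the existing jobs. A sketch (job name and module are assumptions; mirror however blackbox-health is defined today):

```yaml
# prometheus.yml (sketch) — append practice domains to the blackbox targets.
# relabel_configs to the blackbox exporter are omitted; the existing jobs
# already have them.
scrape_configs:
  - job_name: blackbox-health
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://cabinet-dentaire-aubagne.fr
          - https://le-canet.chirurgiens-dentistes.fr
```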

10. Backups

Current

Daily at 2 AM: pg_dump of aletheia_prod + aletheia_staging, tar.gz of Aletheia media. 7-day daily + 28-day weekly retention.

Changes for Helios

Add Umami database to backup loop:

# In backup.sh, extend the database list:
for DB in aletheia_prod aletheia_staging umami; do

Helios has nothing to back up:

  • No database (reads from Aletheia API)
  • No user uploads (media is in Aletheia)
  • Code is in git
  • ISR cache is ephemeral (regenerated from API data)


11. Aether Extraction Roadmap

Phase 1 — Now (Infra Stays in Aletheia Repo)

All infrastructure configs remain at ~/coding/aletheia/aletheia_v2/infra/. Helios-specific additions (nginx vhosts, env templates) are added there. This is pragmatic:

  • The Makefile, setup.sh, and SOPS workflow all work today
  • No refactoring needed before Helios has code

Phase 2 — After First Helios Deploy (Create Aether Repo)

Create baudry-suffren/aether repo. Extract from aletheia_v2/infra/:

| Goes to Aether | Stays in Aletheia | Goes to Helios |
|---|---|---|
| nginx/ (all vhosts, nginx.conf) | docker-compose.yml + overrides | docker-compose.yml + overrides |
| shared/ (postgres, redis, init scripts) | Dockerfile | Dockerfile |
| monitoring/ (full stack) | deploy.sh | deploy.sh |
| backups/ | Makefile (app targets) | Makefile (app targets) |
| security/ (SSH, firewall, fail2ban, sysctl) | infra/envs/ (Aletheia env files) | infra/envs/ (Helios env files) |
| setup.sh | | |
| .sops.yaml | | |
| Makefile (infra targets: infra-diff, infra-deploy, infra-pull, encrypt, decrypt) | | |

On the server, Aether clones to /opt/docker/aether-repo/. The infra Makefile runs from there.

Phase 2 Trigger

Create Aether when you find yourself editing nginx configs or monitoring rules from the Aletheia repo and it feels wrong. Don't extract proactively — wait until the coupling is actually annoying.


12. Changelog

Track infrastructure changes that affect either app. Format: date, what changed, why, which apps affected.

(No changes yet — document created 2026-03-27)