
Infrastructure Contract — Aletheia + Helios + Aether

Date: 2026-03-27
Status: Active
Shared across: Aletheia, Helios, Aether repos


1. Context & Repos

Three repositories share a single infrastructure stack:

| Repo | Purpose | Language | GitHub | Local path |
|---|---|---|---|---|
| Aletheia | Practice management backend + website CMS | Python 3.13, Django 5.2 | baudry-suffren/aletheia_v2 | ~/coding/aletheia/aletheia_v2/ |
| Helios | Multi-tenant dental practice websites | TypeScript, Next.js 16 | TBD (create during B1) | ~/coding/helios/helios_test2/ |
| Aether | Shared infrastructure (nginx, DB, Redis, monitoring, security, backups) | Docker Compose, shell | TBD (create after first Helios deploy) | ~/coding/aether/ (future) |

Current state: Aether doesn't exist yet. All infrastructure lives in aletheia_v2/infra/. It will be extracted to its own repo after Helios is first deployed (see §11).


2. Server Architecture

Production Server

IP: 54.36.99.184 (OVH VPS #1, France — GDPR compliant)

Three environments (dev/staging/prod) run on the same server, isolated by Docker Compose project names, separate databases, and separate Redis DB numbers.

Option A — All on VPS #1 (start here)

VPS #1 (54.36.99.184)
├── nginx (ports 80/443 — front door for everything)
│   ├── aletheia.groupe-suffren.com        → aletheia-prod-web:8000
│   ├── aletheia-staging.groupe-suffren.com → aletheia-staging-web:8000
│   ├── aletheia-dev.groupe-suffren.com    → aletheia-dev-web:8000
│   ├── cabinet-dentaire-aubagne.fr        → helios-prod-web:3000
│   ├── le-canet.chirurgiens-dentistes.fr  → helios-prod-web:3000
│   ├── staging.helios.groupe-suffren.com  → helios-staging-web:3000
│   └── dev.helios.groupe-suffren.com      → helios-dev-web:3000
├── Shared services
│   ├── PostgreSQL 18 (aletheia_prod/staging/dev + umami DBs)
│   └── Redis 7 (DB 0-5 Aletheia, 6-8 Helios if needed)
├── Aletheia containers (per env)
│   ├── aletheia-{env}-web (Gunicorn :8000)
│   ├── aletheia-{env}-celery
│   ├── aletheia-{env}-celery-beat
│   └── aletheia-{env}-celery-heavy
├── Helios containers (per env)
│   └── helios-{env}-web (Next.js standalone :3000)
├── Monitoring (Prometheus, Grafana, Loki, Alloy, exporters)
└── Docker networks
    ├── web       (nginx ↔ app containers)
    ├── backend   (apps ↔ PostgreSQL/Redis)
    └── monitoring

Helios → Aletheia API: Same Docker backend network. http://aletheia-{env}-web:8000/api/v1/websites/. No cross-host hop, no extra configuration.

ISR webhook (Aletheia → Helios): Same Docker web network. http://helios-{env}-web:3000/api/revalidate.
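Both internal URLs follow the `{app}-{env}-web` container naming convention, so they can be derived per environment. A hypothetical TypeScript helper (illustration only, not actual Helios code):

```typescript
type Env = "prod" | "staging" | "dev";

// Hypothetical helpers — containers follow the {app}-{env}-web naming
// convention, so internal URLs can be derived rather than hard-coded.
function aletheiaApiBase(env: Env): string {
  // Helios → Aletheia, over the shared Docker "backend" network.
  return `http://aletheia-${env}-web:8000/api/v1/websites/`;
}

function heliosRevalidateUrl(env: Env): string {
  // Aletheia → Helios ISR webhook, over the shared Docker "web" network.
  return `http://helios-${env}-web:3000/api/revalidate`;
}
```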

Pros: Simplest. No vRack. No second server cost. One deploy pattern. One monitoring stack.
Cons: Shared CPU/RAM. If traffic grows, both apps compete for resources.

Option B — Prod Helios on VPS #2, dev/staging on VPS #1

VPS #1 (54.36.99.184)                    VPS #2 (new, Coolify)
├── nginx                                ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev       └── Helios prod only
├── Helios: staging + dev                    ├── cabinet-dentaire-aubagne.fr
├── Shared services                          ├── le-canet.fr
└── Monitoring                               └── ...practice domains
         │                                        │
         └────── OVH vRack (private) ─────────────┘

Helios prod → Aletheia prod: Via vRack private IP (http://10.0.0.1/api/v1/websites/ routed through VPS #1 nginx).

Helios dev/staging → Aletheia dev/staging: Docker network on VPS #1 (same as Option A).

Coolify setup:

  • Project: "Helios"
  • Environment "production": branch main, auto-deploy on push, practice domains
  • Traefik handles SSL for practice domains on VPS #2

Pros: Dedicated resources for production Helios. Coolify UI for prod deploys.
Cons: vRack setup. Two deploy patterns (Makefile on VPS #1, Coolify on VPS #2).

Option C — Full split (backend VPS #1, frontend VPS #2)

VPS #1 (54.36.99.184)                    VPS #2 (new, Coolify)
├── nginx                                ├── Traefik (Coolify-managed)
├── Aletheia: prod + staging + dev       ├── Helios: prod + staging + dev
├── Shared services                      └── (no DB, no Redis)
└── Monitoring
         │                                        │
         └────── OVH vRack (private) ─────────────┘

All Helios envs → Aletheia: Via vRack. Each Helios env points to its matching Aletheia env.

Coolify setup:

  • Environment "production": branch main, practice domains
  • Environment "staging": branch staging, staging.helios.groupe-suffren.com
  • Environment "development": branch develop, dev.helios.groupe-suffren.com

Pros: Cleanest separation. Dedicated resources. Full Coolify for all envs.
Cons: Most configuration. vRack for all 3 envs. 3 cross-server API connections + 3 ISR webhooks.

Migration Path

Start with Option A. Migrate to B or C when:

  • VPS #1 CPU consistently > 70% or RAM > 80%
  • You need Coolify preview environments (team grows, PR-based reviews)
  • A practice gets significant traffic (thousands of daily visitors)

The migration is straightforward: move Helios containers to VPS #2, update API URLs to use vRack IP, configure Coolify or Docker Compose on VPS #2.

OOM Priority Allocation (Option A)

Lower score = higher priority (killed last under memory pressure).

| Service | OOM Score | Memory Limit |
|---|---|---|
| shared_postgres | -800 | — (unlimited) |
| nginx-proxy | -600 | |
| shared_redis | -500 | 128MB |
| aletheia-prod-web | -300 | 2GB |
| helios-prod-web | -200 | 1GB |
| aletheia-staging-web | -100 | 2GB |
| helios-staging-web | 0 | 512MB |
| aletheia-dev-web | 0 | 1GB |
| helios-dev-web | 100 | 512MB |
| celery workers | 200-600 | 256MB-2GB |
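In Compose, the table above maps to per-service keys. A sketch of the Helios prod slice (values from the table; `oom_score_adj` and `mem_limit` are standard Compose keys):

```yaml
# docker-compose.prod.yml (sketch) — Helios prod row of the table above
services:
  web:
    oom_score_adj: -200   # killed after staging/dev apps, before nginx/DB
    mem_limit: 1g
    restart: always
```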

3. Local Development Setup

Developer Machine

IP 192.168.0.244 is the developer's Mac on the local network — not a remote server.

Running Both Apps Locally

Mac (localhost)
├── Docker Desktop
│   └── Aletheia containers (docker compose up)
│       ├── aletheia-local-web     → localhost:8000
│       ├── postgres               → localhost:5433
│       └── redis                  → localhost:6379
└── Bare metal (no Docker)
    └── Helios (npm run dev)       → localhost:3000

Steps:

  1. Start Aletheia:

    cd ~/coding/aletheia/aletheia_v2
    docker compose up
    # Available at http://localhost:8000
    

  2. Start Helios:

    cd ~/coding/helios/helios_test2   # (or wherever Next.js code lives after B1)
    npm run dev
    # Available at http://localhost:3000
    

  3. Helios .env.local:

    ALETHEIA_API_URL=http://localhost:8000/api/v1/websites
    REVALIDATION_SECRET=dev-secret
    

Multi-Tenancy Testing

Next.js resolves practices from the Host header. Add to /etc/hosts:

127.0.0.1  cabinet-dentaire-aubagne.local
127.0.0.1  le-canet.local
127.0.0.1  cabinet-bodin.local
127.0.0.1  david-simon-thiais.local

Browse http://cabinet-dentaire-aubagne.local:3000/ — the proxy reads the Host, maps to a practice, fetches from Aletheia.
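A hypothetical sketch of that Host-to-practice mapping (the real resolution presumably comes from Aletheia data; this static map and function exist only to illustrate the mechanism, including the .local aliases above):

```typescript
// Illustrative only — real Helios code would load this mapping from Aletheia.
const PRACTICE_BY_HOST: Record<string, string> = {
  "cabinet-dentaire-aubagne.fr": "cda",
  "cabinet-dentaire-aubagne.local": "cda",
  "le-canet.chirurgiens-dentistes.fr": "vsm",
  "le-canet.local": "vsm",
  "cabinet-bodin.local": "pds",
  "david-simon-thiais.local": "ths",
};

function resolvePractice(hostHeader: string): string | null {
  // Strip the port (":3000" in local dev) and any "www." prefix.
  const host = hostHeader.split(":")[0].replace(/^www\./, "");
  return PRACTICE_BY_HOST[host] ?? null;
}
```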

Testing Against Server Environments

To test Helios locally against staging or dev Aletheia on the server:

# .env.local — point at server instead of local
ALETHEIA_API_URL=https://aletheia-dev.groupe-suffren.com/api/v1/websites

This is useful when you want to test with real production-like data without running Aletheia locally.

ISR Webhook (Local Limitation)

The ISR revalidation webhook (Aletheia → Helios) does not work in local dev because:

  • Aletheia runs in Docker and can't reach localhost:3000 on the Mac host easily
  • The webhook URL would need to be http://host.docker.internal:3000/api/revalidate (Docker Desktop for Mac)

Workaround: Manually trigger revalidation after content changes:

curl -X POST http://localhost:3000/api/revalidate \
  -H "Content-Type: application/json" \
  -d '{"secret":"dev-secret","tags":["page:implant-dentaire-aubagne"]}'

Or simply restart npm run dev (clears all cache).
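The /api/revalidate handler isn't written yet; whatever form it takes, it should compare the secret in constant time. A sketch of just that check (helper name is hypothetical; the real handler would go on to call Next.js tag revalidation):

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical helper for the /api/revalidate route: constant-time
// comparison of the submitted secret against REVALIDATION_SECRET.
function secretMatches(received: string, expected: string): boolean {
  const a = Buffer.from(received);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```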

Media Files

Aletheia returns absolute media URLs in API responses (e.g., http://localhost:8000/media/...). These work directly in local dev because Aletheia serves media on the same host. In production, URLs point to the Cloudflare CDN domain.

Local Dev Is Identical Across Server Options

Regardless of whether the server uses Option A, B, or C — local development is always the same: both apps on your Mac, Helios at :3000, Aletheia at :8000, connected via localhost.


4. Shared Services

PostgreSQL 18

Single container, multiple databases. Bound to 127.0.0.1:5432 (localhost only on the server, not exposed to the network).

| Database | User | Environment |
|---|---|---|
| aletheia_prod | aletheia_prod | Aletheia production |
| aletheia_staging | aletheia_staging | Aletheia staging |
| aletheia_dev | aletheia_dev | Aletheia development |
| umami | umami | Umami Analytics (shared) |

Helios does not have its own database. It reads data from Aletheia's API.

Redis 7

Single container, --maxmemory 128mb, allkeys-lru eviction.

| DB | Use |
|---|---|
| 0 | Aletheia prod — Celery broker + result |
| 1 | Aletheia prod — cache |
| 2 | Aletheia staging — Celery broker + result |
| 3 | Aletheia staging — cache |
| 4 | Aletheia dev — Celery broker + result |
| 5 | Aletheia dev — cache |
| 6 | Helios prod — ISR cache (if needed, otherwise filesystem) |
| 7 | Helios staging — ISR cache |
| 8 | Helios dev — ISR cache |
| 9-15 | Reserved |

If Helios uses Redis for ISR caching, consider increasing maxmemory to 192mb or 256mb.
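If Helios does adopt Redis, the allocation above can be derived rather than hard-coded. A hypothetical helper (the `shared_redis` hostname follows the shared-container name used elsewhere in this document):

```typescript
// DB numbers from the allocation table: 6 = prod, 7 = staging, 8 = dev.
const ISR_CACHE_DB = { prod: 6, staging: 7, dev: 8 } as const;

// Hypothetical helper — builds the Redis URL for a Helios environment's
// ISR cache, assuming "shared_redis" resolves on the backend network.
function heliosRedisUrl(env: keyof typeof ISR_CACHE_DB): string {
  return `redis://shared_redis:6379/${ISR_CACHE_DB[env]}`;
}
```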


5. Docker Compose Structure

Pattern (Same for Both Apps)

app/
├── docker-compose.yml              # Base: service definitions, networks
├── docker-compose.prod.yml         # Prod: OOM scores, memory limits, restart: always
├── docker-compose.staging.yml      # Staging: lower limits, restart: unless-stopped
├── docker-compose.dev.yml          # Dev-on-server: development target, lowest limits
├── docker-compose.override.yml     # Local dev: DB/Redis, source mounting, hot reload
├── Dockerfile                      # Multi-stage build
└── deploy.sh                       # Deploy script (git fetch, build, restart)

Helios-Specific

# docker-compose.yml (base)
services:
  web:
    build:
      context: .
      target: production
    container_name: helios-${ENVIRONMENT:-local}-web
    env_file:
      - path: ${ENV_FILE:-.env.local}
        required: false
    networks:
      - web
      - backend    # needed for direct API calls to Aletheia (Option A)
    expose:
      - "3000"
    volumes:
      - nextjs-cache:/app/.next/cache    # ISR file cache persistence

networks:
  web:
    external: true
  backend:
    external: true

volumes:
  nextjs-cache:

Key differences from Aletheia:

  • One service (no Celery, no beat, no heavy worker)
  • Port 3000 (not 8000)
  • No database connection (talks to Aletheia via HTTP)
  • Cache volume for ISR persistence across deploys

Helios Dockerfile (Sketch)

# Stage 1: Dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production
FROM node:22-alpine AS production
WORKDIR /app
RUN addgroup -S nextjs && adduser -S nextjs -G nextjs
COPY --from=builder --chown=nextjs:nextjs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nextjs /app/.next/static ./.next/static
COPY --from=builder --chown=nextjs:nextjs /app/public ./public
USER nextjs
EXPOSE 3000
ENV PORT=3000 HOSTNAME="0.0.0.0"
CMD ["node", "server.js"]

Uses Next.js standalone output mode — minimal production image with only the needed files.


6. Nginx Routing

Domain Mapping

Practice codes: cda (Aubagne), vsm (Le Canet), pds (Bodin), ths (David Simon)

| Domain | Routes to | Environment |
|---|---|---|
| aletheia.groupe-suffren.com | aletheia-prod-web:8000 | Aletheia prod |
| aletheia-staging.groupe-suffren.com | aletheia-staging-web:8000 | Aletheia staging |
| aletheia-dev.groupe-suffren.com | aletheia-dev-web:8000 | Aletheia dev |
| cabinet-dentaire-aubagne.fr | helios-prod-web:3000 | Helios prod (real domain) |
| (+ all practice prod domains) | helios-prod-web:3000 | Helios prod |
| cda.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Aubagne) |
| cda-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Aubagne) |
| cda-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Aubagne) |
| vsm.groupe-suffren.com | helios-prod-web:3000 | Helios prod preview (Le Canet) |
| vsm-staging.groupe-suffren.com | helios-staging-web:3000 | Helios staging (Le Canet) |
| vsm-dev.groupe-suffren.com | helios-dev-web:3000 | Helios dev (Le Canet) |
| (same pattern for pds, ths) | | |
| monitoring.groupe-suffren.com | grafana:3000 | Monitoring |

Helios Nginx Vhost (prod example)

server {
    listen 443 ssl;
    server_name
        cabinet-dentaire-aubagne.fr
        www.cabinet-dentaire-aubagne.fr
        le-canet.chirurgiens-dentistes.fr
        cabinet-bodin.fr
        dr-david-simon.chirurgiens-dentistes.fr
        ;
        # Add new practice domains here

    ssl_certificate     /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/cabinet-dentaire-aubagne.fr/privkey.pem;
    # OR: Cloudflare origin cert (single cert for all domains behind Cloudflare)

    resolver 127.0.0.11 valid=30s;

    # Next.js static assets (immutable, long cache)
    location /_next/static/ {
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        expires 365d;
        add_header Cache-Control "public, immutable";
    }

    # All requests → Next.js
    location / {
        limit_req zone=general burst=20 nodelay;
        set $upstream http://helios-prod-web:3000;
        proxy_pass $upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;
    }
}

Important: The Host header is passed through (proxy_set_header Host $host). Next.js uses this to resolve which practice to render.

SSL Strategy

Recommended: Cloudflare (already in the spec).

  • All practice domains point DNS to Cloudflare
  • Cloudflare handles SSL termination at the edge
  • Nginx uses a Cloudflare origin certificate (one cert covers all domains)
  • Simplifies cert management vs. individual Let's Encrypt certs per domain


7. Coolify Setup (Options B/C — Future)

When migrating to a second VPS with Coolify:

Coolify Project Structure

Coolify Dashboard (VPS #2)
└── Project: "Helios"
    ├── Environment: production
    │   ├── Source: GitHub baudry-suffren/helios, branch main
    │   ├── Build: Dockerfile (standalone)
    │   ├── Domains: cabinet-dentaire-aubagne.fr, le-canet.fr, ...
    │   ├── Auto-deploy: on push to main
    │   └── Env vars:
    │       ALETHEIA_API_URL=http://10.0.0.1:80/api/v1/websites  # vRack → VPS #1 nginx
    │       REVALIDATION_SECRET=<secret>
    ├── Environment: staging
    │   ├── Source: branch staging
    │   ├── Domain: staging.helios.groupe-suffren.com
    │   └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-staging/...
    └── Environment: development
        ├── Source: branch develop
        ├── Domain: dev.helios.groupe-suffren.com
        └── Env vars: ALETHEIA_API_URL=http://10.0.0.1:80/api-dev/...

vRack Configuration

OVH vRack creates a private network between VPS #1 and VPS #2:

  • VPS #1 (Aletheia): 10.0.0.1
  • VPS #2 (Helios): 10.0.0.2
  • Traffic between them is private, fast, and free

VPS #1 nginx needs additional server blocks (or location blocks) to expose each Aletheia environment's API on the vRack interface for Helios to reach.
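A sketch of one such block (assumptions: the vRack interface is 10.0.0.1 and the path prefixes match the Coolify env vars in this section; adapt to the real layout):

```nginx
# Sketch — expose Aletheia prod's API on the vRack interface only.
# Staging/dev would get /api-staging/ and /api-dev/ location blocks
# proxying to their matching containers.
server {
    listen 10.0.0.1:80;

    location /api/ {
        proxy_pass http://aletheia-prod-web:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```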

ISR Webhook (Aletheia → Helios, cross-server)

Aletheia sends the webhook to Helios via vRack:

HELIOS_REVALIDATION_URL=http://10.0.0.2:3000/api/revalidate  # prod

Coolify/Traefik on VPS #2 routes this to the correct Helios container.


8. Deploy & Secrets

Option A (Makefile Deploy — Same Pattern for Both Apps)

Aletheia (existing):

make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build, migrate, collectstatic, restart, nginx swap

Helios (new, simpler — no migrations, no collectstatic):

make deploy ENV=prod REF=main
# → deploy.sh: git fetch, build (includes next build in Docker), restart, nginx swap

Secret Management (SOPS + age)

Same age keypair across all repos. Key stored at /opt/docker/.age-key.txt on server.

| Repo | Encrypted files |
|---|---|
| Aletheia | infra/envs/.env.{dev,staging,prod}.enc |
| Helios | infra/envs/.env.{dev,staging,prod}.enc |
| Aether (future) | shared/.env.enc, nginx/.htpasswd.enc, init-scripts/*.sql.enc |
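A minimal .sops.yaml for the Helios repo could follow the same pattern as Aletheia's (sketch; the age recipient is a placeholder, not the real public key):

```yaml
# .sops.yaml (sketch) — encrypt env files for the shared age keypair.
# <AGE_PUBLIC_KEY> is a placeholder for the real recipient.
creation_rules:
  - path_regex: infra/envs/.*
    age: <AGE_PUBLIC_KEY>
```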

Server Directory Layout

/opt/docker/
├── nginx/                    # Reverse proxy (all vhosts)
├── shared/                   # PostgreSQL + Redis
├── aletheia/
│   ├── repo/                 # Git clone
│   ├── envs/                 # .env.dev, .env.staging, .env.prod
│   └── media/                # User uploads (dev/, staging/, prod/)
├── helios/
│   ├── repo/                 # Git clone
│   └── envs/                 # .env.dev, .env.staging, .env.prod
├── monitoring/               # Full monitoring stack
└── backups/
    ├── scripts/backup.sh
    ├── daily/
    └── weekly/

9. Monitoring

Automatic (Zero Config for New Apps)

  • Alloy discovers all Docker containers automatically via discovery.docker. Helios containers appear with labels container=helios-prod-web, compose_service=web. Logs flow to Loki automatically.
  • Docker exporter tracks CPU/memory/network for all containers. Helios appears in existing Grafana Docker dashboard automatically.

Manual Additions (When Helios Deploys)

| Config file | Change |
|---|---|
| prometheus/prometheus.yml | Add Helios practice domains to blackbox-health and blackbox-ssl scrape targets |
| prometheus/alerts/helios.rules.yml | New alert rules: health check failing, container restarts |
| grafana/provisioning/dashboards/ | Optional: Helios-specific dashboard (if Next.js exports Prometheus metrics) |
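The prometheus.yml change is just new targets on the existing jobs. A sketch (job name and module are assumptions; mirror however blackbox-health is defined today):

```yaml
# prometheus.yml (sketch) — append practice domains to the blackbox targets.
# relabel_configs to the blackbox exporter are omitted; the existing jobs
# already have them.
scrape_configs:
  - job_name: blackbox-health
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://cabinet-dentaire-aubagne.fr
          - https://le-canet.chirurgiens-dentistes.fr
```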

10. Backups

Current

Daily at 2 AM: pg_dump of aletheia_prod + aletheia_staging, tar.gz of Aletheia media. 7-day daily + 28-day weekly retention.

Changes for Helios

Add Umami database to backup loop:

# In backup.sh, extend the database list:
for DB in aletheia_prod aletheia_staging umami; do

Helios has nothing to back up:

  • No database (reads from Aletheia API)
  • No user uploads (media is in Aletheia)
  • Code is in git
  • ISR cache is ephemeral (regenerated from API data)


11. Aether Extraction Roadmap

Phase 1 — Now (Infra Stays in Aletheia Repo)

All infrastructure configs remain at ~/coding/aletheia/aletheia_v2/infra/. Helios-specific additions (nginx vhosts, env templates) are added there. This is pragmatic:

  • The Makefile, setup.sh, and SOPS workflow all work today
  • No refactoring needed before Helios has code

Phase 2 — After First Helios Deploy (Create Aether Repo)

Create baudry-suffren/aether repo. Extract from aletheia_v2/infra/:

| Goes to Aether | Stays in Aletheia | Goes to Helios |
|---|---|---|
| nginx/ (all vhosts, nginx.conf) | docker-compose.yml + overrides | docker-compose.yml + overrides |
| shared/ (postgres, redis, init scripts) | Dockerfile | Dockerfile |
| monitoring/ (full stack) | deploy.sh | deploy.sh |
| backups/ | Makefile (app targets) | Makefile (app targets) |
| security/ (SSH, firewall, fail2ban, sysctl) | infra/envs/ (Aletheia env files) | infra/envs/ (Helios env files) |
| setup.sh | | |
| .sops.yaml | | |
| Makefile (infra targets: infra-diff, infra-deploy, infra-pull, encrypt, decrypt) | | |

On the server, Aether clones to /opt/docker/aether-repo/. The infra Makefile runs from there.

Phase 2 Trigger

Create Aether when you find yourself editing nginx configs or monitoring rules from the Aletheia repo and it feels wrong. Don't extract proactively — wait until the coupling is actually annoying.


12. Changelog

Track infrastructure changes that affect either app. Format: date, what changed, why, which apps affected.

(No changes yet — document created 2026-03-27)