Disaster Recovery
Overview
This page describes the full server rebuild procedure, from a blank Debian 12+ VPS to a running production environment with restored data.
Prerequisites
- Access to the Aether repo (GitHub: baudry-suffren/aether)
- The age key file (stored securely off-server)
- A recent database backup
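Before provisioning anything, it can help to confirm that the recovery inputs listed above are actually on hand. The following is a minimal sketch, not part of the Aether repo: the function name preflight and both file paths in the usage comment are hypothetical placeholders.

```shell
# Pre-flight check: confirm the age key and database backup exist and are
# non-empty before starting a rebuild. Paths are supplied by the caller.
preflight() {
  local age_key="$1"
  local backup="$2"
  local status=0
  if [ ! -s "$age_key" ]; then
    echo "missing or empty age key: $age_key" >&2
    status=1
  fi
  if [ ! -s "$backup" ]; then
    echo "missing or empty database backup: $backup" >&2
    status=1
  fi
  return "$status"
}

# Usage (hypothetical paths):
# preflight ~/secure/age-key.txt ~/backups/backup.sql && echo "ready to rebuild"
```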
Recovery Steps
1. Provision a new VPS
- Debian 12+ (Bookworm or later)
- Minimum: 4 vCPU, 8 GB RAM, 80 GB SSD
- Add your SSH public key during provisioning
2. Initial server setup
# SSH in and clone Aether
git clone git@github.com:baudry-suffren/aether.git /opt/docker/aether/repo
cd /opt/docker/aether/repo
# Run the setup script (installs Docker, security hardening, etc.)
sudo ./setup.sh
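The later steps on this page call git, docker, and make, so a quick check that all three are on the PATH after setup can catch a partial install early. This is an illustrative sketch; the helper name require_cmds is hypothetical, and what setup.sh installs beyond Docker is not confirmed here.

```shell
# Report any required command that is not on the PATH.
# Returns non-zero if at least one is missing.
require_cmds() {
  local missing=0
  local c
  for c in "$@"; do
    if ! command -v "$c" >/dev/null 2>&1; then
      echo "missing: $c" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# require_cmds git docker make && echo "tooling present"
```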
3. Restore secrets
Copy the age key to the server, then decrypt all secrets:
# Copy age key (from secure storage)
sudo cp /path/to/age-key.txt /opt/docker/.age-key.txt
sudo chmod 600 /opt/docker/.age-key.txt
# Decrypt all secrets to their server locations
make decrypt
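Since the age key protects every secret, it may be worth confirming its permissions before decrypting. A small sketch, assuming GNU stat as shipped with Debian; the helper name key_mode_ok is hypothetical.

```shell
# Succeeds only if the file exists and its permission bits are exactly 600.
key_mode_ok() {
  local f="$1"
  [ -f "$f" ] && [ "$(stat -c '%a' "$f")" = "600" ]
}

# Usage:
# key_mode_ok /opt/docker/.age-key.txt || echo "fix key permissions before make decrypt"
```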
4. Start shared services
5. Restore database
# Copy backup file to server, then restore
docker exec -i shared_postgres psql -U admin < backup.sql
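The restore only works once PostgreSQL inside shared_postgres is accepting connections, which may take a few seconds after the container starts. One way to guard against that is a small retry helper combined with pg_isready, sketched below. The helper name wait_for and the retry counts are assumptions; ON_ERROR_STOP is a standard psql variable that makes the restore stop at the first failing statement instead of continuing.

```shell
# Retry a command until it succeeds or the attempts run out.
# Arguments: attempts, delay in seconds, then the command to run.
wait_for() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait up to about 60 seconds, then restore and stop on the first error.
# wait_for 30 2 docker exec shared_postgres pg_isready -U admin \
#   && docker exec -i shared_postgres psql -U admin -v ON_ERROR_STOP=1 < backup.sql
```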
6. Clone and deploy applications
# Aletheia
git clone git@github.com:baudry-suffren/aletheia_v2.git /opt/docker/aletheia/repo
cd /opt/docker/aletheia/repo && make deploy ENV=prod
# Helios
git clone git@github.com:baudry-suffren/helios.git /opt/docker/helios/repo
cd /opt/docker/helios/repo && make deploy ENV=prod
7. Deploy infrastructure services
cd /opt/docker/aether/repo
make deploy # Copies all configs
make restart # Starts nginx, monitoring, umami
8. SSL certificates
# Obtain certificates for all domains
docker exec certbot certbot certonly --webroot -w /var/www/certbot \
--non-interactive --agree-tos --email admin@groupe-suffren.com \
-d aletheia.groupe-suffren.com \
-d monitoring.groupe-suffren.com \
-d analytics.groupe-suffren.com \
-d docs.groupe-suffren.com
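Certificate authorities rate-limit issuance, so a rehearsal with certbot's --dry-run flag (which uses the staging endpoint and saves nothing to disk) can be useful before the real request. The sketch below only prints the command for review; the function name certbot_dry_run_cmd is hypothetical, and the container name and webroot path are copied from the command above.

```shell
# Print a certbot dry-run command for the given email and domains.
# Nothing is executed; review the output, then run it by hand.
certbot_dry_run_cmd() {
  local email="$1"
  shift
  local d
  printf 'docker exec certbot certbot certonly --dry-run --webroot -w /var/www/certbot'
  printf ' --non-interactive --agree-tos --email %s' "$email"
  for d in "$@"; do
    printf ' -d %s' "$d"
  done
  printf '\n'
}

# Usage:
# certbot_dry_run_cmd admin@groupe-suffren.com \
#   aletheia.groupe-suffren.com monitoring.groupe-suffren.com
```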
9. Verify
- [ ] All containers running: docker ps
- [ ] Aletheia accessible: curl -sI https://aletheia.groupe-suffren.com
- [ ] Practice websites loading: check each domain
- [ ] Monitoring operational: https://monitoring.groupe-suffren.com
- [ ] Backups scheduled: crontab -l
- [ ] DNS pointing to new server IP
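Parts of the checklist above can be scripted so that the result is a list of PASS and FAIL lines. This is a rough sketch: the helper name run_check is hypothetical, and the commented invocations only test that each command exits successfully, which is weaker than the manual checks (for example, docker ps succeeding does not prove every container is up).

```shell
# Run one named check, print PASS or FAIL, and return the check's status.
run_check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    return 1
  fi
}

# Usage, mirroring the checklist (curl -f turns HTTP errors into failures):
# run_check "Docker daemon reachable" docker ps
# run_check "Aletheia reachable"      curl -sfI https://aletheia.groupe-suffren.com
# run_check "Monitoring reachable"    curl -sfI https://monitoring.groupe-suffren.com
# run_check "Crontab present"         crontab -l
```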
Testing DR
Use test-dr.sh in the Aether repo to validate the DR procedure in a non-destructive way.
Recovery Time Objective (RTO)
Estimated recovery time: 1-2 hours from a blank VPS with all backups available.
Backup Locations
| Data | Location | Frequency |
|---|---|---|
| Database (PostgreSQL) | /opt/docker/backups/daily/ | Daily |
| Encrypted secrets | Git (.enc files in Aether repo) | On change |
| Application code | GitHub repos | On push |
| Media files | /opt/docker/aletheia/repo/media/ | Not currently backed up off-site |
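Because the daily database backup is the main recovery input, a periodic freshness check on the backup directory can flag a silently failing job. A minimal sketch, assuming backups land as regular files in the directory listed above; the helper name has_recent_backup and the 25-hour threshold are illustrative choices.

```shell
# Succeeds if the directory contains at least one regular file modified
# within the last max_age_min minutes.
has_recent_backup() {
  local dir="$1"
  local max_age_min="$2"
  [ -n "$(find "$dir" -type f -mmin "-$max_age_min" 2>/dev/null | head -n 1)" ]
}

# Usage: 1500 minutes is 25 hours, leaving slack around a daily schedule.
# has_recent_backup /opt/docker/backups/daily 1500 || echo "no database backup in the last 25 hours"
```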
Gap: Off-site backups
Database backups currently stay on the same server. See the infra-offsite-backup roadmap item for the planned off-site backup strategy.