Backups & Restore
pgBackRest integration for automated backups, WAL archiving, snapshots, and point-in-time recovery in AxiomDB
AxiomDB uses pgBackRest for enterprise-grade backup and recovery. This guide covers the full backup lifecycle: daily full backups, hourly incremental backups, WAL archiving, named snapshots, and restore procedures.
Backup Strategy
Daily fulls, hourly incrementals, and continuous WAL archiving
pgBackRest Setup
Stanza configuration, repository setup, and encryption
Running Backups
Manual and automated backup commands
Snapshots
Creating and managing named restore points
Restore
Point-in-time recovery and branch restoration
Verification
Integrity checks, test restores, and monitoring
Backup Strategy
AxiomDB implements a 3-tier backup strategy:
┌─────────────────────────────────────────────────────────────┐
│ AxiomDB Backup Pipeline │
├─────────────────────────────────────────────────────────────┤
│ │
│ Tier 1: WAL Archiving (Continuous) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ PostgreSQL → WAL files → pgBackRest archive-push │ │
│ │ RPO: ~0 (seconds of data loss) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ Tier 2: Incremental Backups (Hourly) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ pgBackRest --type=diff (changed blocks only) │ │
│ │ RPO: 1 hour │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ Tier 3: Full Backups (Daily) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ pgBackRest --type=full (complete database copy) │ │
│ │ RPO: 24 hours (but WAL covers gaps) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ Named Snapshots (On-Demand) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ pgBackRest restore point markers │ │
│ │ Used for pre-migration safety nets │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Recovery Point Objective (RPO)
| Backup Type | Frequency | RPO | Storage Cost | Restore Speed |
|---|---|---|---|---|
| WAL Archive | Continuous | Seconds | Medium | Slow (replay) |
| Incremental | Hourly | 1 hour | Low | Medium |
| Full | Daily | 24 hours | High | Fast |
| Snapshot | On-demand | Point-in-time | Low | Fast |
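To see how the tiers combine during a point-in-time restore, the following minimal sketch picks the newest backup that finished at or before the recovery target; WAL replay then covers the gap between that backup and the target. The function name `pick_base_backup` and its epoch-second arguments are hypothetical and exist only for illustration. pgBackRest makes this choice itself during a restore.

```shell
#!/usr/bin/env bash
# Illustrative only: choose the newest backup that completed at or before
# the recovery target. All arguments are Unix epoch seconds.
# Usage: pick_base_backup TARGET_EPOCH BACKUP_STOP_EPOCH...
pick_base_backup() {
    target=$1
    shift
    best=""
    for stop in "$@"; do
        # A backup is usable only if it completed no later than the target.
        if [ "$stop" -le "$target" ]; then
            if [ -z "$best" ] || [ "$stop" -gt "$best" ]; then
                best=$stop
            fi
        fi
    done
    if [ -z "$best" ]; then
        echo "no backup precedes target $target" >&2
        return 1
    fi
    echo "$best"
}
```

The closer the chosen backup is to the target, the less WAL has to be replayed, which is why hourly backups shorten restores even though WAL archiving alone already bounds data loss.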
pgBackRest Configuration
Stanza Configuration
The pgBackRest stanza defines connection details, backup behavior, and repository location.
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=7
repo1-retention-diff=24
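repo1-cipher-type=aes-256-cbc
# repo1-cipher-pass is not set here: pgBackRest does not expand ${...} in this file.
# Export PGBACKREST_REPO1_CIPHER_PASS from the secrets manager instead.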
repo1-bundle=y
repo1-bundle-limit=2048MB
compress-type=zst
compress-level=6
process-max=4
log-level-console=info
log-level-file=detail
start-fast=y
delta=y
[axiomdb]
pg1-path=/var/lib/postgresql/14/main
pg1-port=5432
pg1-user=postgres
pg1-socket-path=/var/run/postgresql
# archive_mode and archive_command are PostgreSQL settings, not pgBackRest options; see "PostgreSQL WAL Archive Setup" below.
Encryption
All backups are encrypted at rest using AES-256-CBC. The cipher passphrase is stored in the AxiomDB secrets manager and injected via environment variable. Never store the passphrase in plaintext configuration files.
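One way to inject the passphrase is through pgBackRest's environment-variable form of an option: the `PGBACKREST_` prefix followed by the option name in upper case with hyphens replaced by underscores. The sketch below assumes the secrets manager exposes the passphrase as a readable file; the function name `load_cipher_pass` and the file-based handoff are hypothetical stand-ins for whatever the AxiomDB secrets manager actually provides.

```shell
#!/usr/bin/env bash
# Sketch: read the repository passphrase from a secret file and export it
# under the environment-variable name pgBackRest uses for repo1-cipher-pass.
# The secret file is an assumption, not part of AxiomDB.
load_cipher_pass() {
    secret_file=$1
    if [ ! -r "$secret_file" ]; then
        echo "cannot read secret file: $secret_file" >&2
        return 1
    fi
    pass=""
    # Read the first line only; tolerate a missing trailing newline.
    IFS= read -r pass < "$secret_file" || true
    if [ -z "$pass" ]; then
        echo "secret file is empty: $secret_file" >&2
        return 1
    fi
    export PGBACKREST_REPO1_CIPHER_PASS="$pass"
}
```

Every process that runs pgBackRest needs the variable in its environment, including cron jobs and the PostgreSQL server that executes archive_command.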
Initialize the Stanza
# Create the stanza
pgbackrest --stanza=axiomdb stanza-create
# Verify configuration
pgbackrest --stanza=axiomdb check
PostgreSQL WAL Archive Setup
-- postgresql.conf settings
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=axiomdb archive-push %p';
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 5;
-- Reload configuration
SELECT pg_reload_conf();
Archive Mode Requires Restart
Changing archive_mode, wal_level, or max_wal_senders requires a PostgreSQL restart, not just a reload; only archive_command takes effect on reload. Plan this during a maintenance window.
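After the reload, PostgreSQL marks settings that are still waiting for a restart in the `pending_restart` column of `pg_settings`. The sketch below lists them; the function names are hypothetical, and connecting as the `postgres` role over the local socket is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: report settings that were changed but need a restart to apply.
pending_restart_settings() {
    psql -U postgres -At -c \
        "SELECT name FROM pg_settings WHERE pending_restart ORDER BY name;"
}

restart_status() {
    pending=$(pending_restart_settings) || return 1
    if [ -n "$pending" ]; then
        # Unquoted on purpose: joins the one-name-per-line output with spaces.
        echo "restart required for:" $pending
    else
        echo "no restart pending"
    fi
}
```

Running `restart_status` before and after the maintenance window gives a quick confirmation that the archive settings are actually in effect.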
Running Backups
Manual Backup Commands
# Full backup (complete copy of all databases)
pgbackrest --stanza=axiomdb --type=full backup
# Differential backup (changes since the last full backup)
pgbackrest --stanza=axiomdb --type=diff backup
# Incremental backup (changes since the last backup of any type)
pgbackrest --stanza=axiomdb --type=incr backup
Automated Backup Schedule
# /etc/cron.d/pgbackrest
# Full backup daily at 2:00 AM
0 2 * * * postgres pgbackrest --stanza=axiomdb --type=full backup
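# Hourly backup (--type=diff: changes since the last full backup), skipping 2:00 AM so it does not collide with the full backup
0 0-1,3-23 * * * postgres pgbackrest --stanza=axiomdb --type=diff backup
# Verify backup integrity daily at 4:00 AM
0 4 * * * postgres pgbackrest --stanza=axiomdb verify
Backup Output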
2025-01-15 02:00:01.000 P00 INFO: backup command begin 2.51: ...
2025-01-15 02:00:01.100 P00 INFO: execute exclusive pg_start_backup()
2025-01-15 02:00:02.200 P00 INFO: backup start = 0/4000028, lsn = 0/4000028
2025-01-15 02:00:05.500 P00 INFO: check archive for segment 0/4000028
2025-01-15 02:05:30.000 P00 INFO: new backup label = 20250115-020001F
2025-01-15 02:05:31.000 P00 INFO: full backup size = 2.1GB
2025-01-15 02:05:31.100 P00 INFO: new backup size = 2.1GB, file total = 1847
2025-01-15 02:05:31.200 P00 INFO: backup command end: completed successfully
Backup Status Check
# View all backups
pgbackrest --stanza=axiomdb info
# Example output:
# stanza: axiomdb
# status: ok
# cipher: aes-256-cbc
# db (current)
# wal archive min/max (14): 000000010000000000000001/000000010000000000000042
#
# full backup: 20250115-020001F
# timestamp start/stop: 2025-01-15 02:00:01+00 / 2025-01-15 02:05:31+00
# wal start/stop: 000000010000000000000001 / 000000010000000000000001
# database size: 2.1GB, database backup size: 2.1GB
# repository size: 650MB, repository backup size: 648MB
#
# diff backup: 20250115-020001F_20250115-030000D
# timestamp start/stop: 2025-01-15 03:00:00+00 / 2025-01-15 03:01:15+00
# wal start/stop: 000000010000000000000002 / 000000010000000000000002
# database size: 2.1GB, database backup size: 120MB
# repository size: 652MB, repository backup size: 45MB
Named Snapshots
Snapshots are named restore points that allow you to restore to a specific moment, typically used before risky operations like schema migrations.
Creating a Snapshot
# Create a named snapshot before a migration
pgbackrest --stanza=axiomdb --type=full \
  --annotation=pre-migration-v2.3=true \
  --annotation=branch=main \
  backup
-- Also create a PostgreSQL restore point for WAL-level recovery
SELECT pg_create_restore_point('pre-migration-v2.3');
Listing Snapshots
# List all backups (including annotated snapshots)
pgbackrest --stanza=axiomdb info --output=json
[
  {
    "name": "axiomdb",
    "status": {
      "code": 0,
      "message": "ok"
    },
    "db": [
      {
        "id": 1,
        "system-id": 7312345678901234567,
        "version": 14
      }
    ],
    "backup": [
      {
        "type": "full",
        "label": "20250115-020001F",
        "timestamp": {
          "start": 1736906401,
          "stop": 1736906731
        },
        "annotation": {
          "pre-migration-v2.3": "true",
          "branch": "main"
        },
        "info": {
          "size": 2254857830,
          "delta": 2254857830
        }
      }
    ]
  }
]
Snapshot Retention
# Apply the configured retention policy; backups and WAL that fall outside it are removed
pgbackrest --stanza=axiomdb expire
# One-off override: retain 30 full backups for this run
pgbackrest --stanza=axiomdb --repo1-retention-full=30 expire
Snapshot Best Practices
Always create a snapshot before: (1) major schema migrations, (2) bulk data operations, (3) dependency upgrades, (4) production deployments. Name snapshots descriptively: pre-migration-{version}, pre-deploy-{sha}, pre-bulk-update-{date}.
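The naming convention above can be enforced with a small helper so that every snapshot name follows the same pattern. This is a minimal sketch: the function name `snapshot_name`, the set of accepted kinds, and the allowed character set are assumptions chosen to match the three patterns listed above.

```shell
#!/usr/bin/env bash
# Sketch: build a snapshot name of the form pre-KIND-VALUE and reject
# anything that does not fit the convention.
# Usage: snapshot_name KIND VALUE
snapshot_name() {
    kind=$1
    value=$2
    case "$kind" in
        migration|deploy|bulk-update) ;;
        *)
            echo "unknown snapshot kind: $kind" >&2
            return 1
            ;;
    esac
    case "$value" in
        # Reject empty values and anything beyond letters, digits, '.', '_' and '-'.
        ""|*[!A-Za-z0-9._-]*)
            echo "invalid snapshot value: $value" >&2
            return 1
            ;;
    esac
    echo "pre-$kind-$value"
}
```

For example, `snapshot_name migration v2.3` prints `pre-migration-v2.3`, which can then be used both as the backup annotation and as the argument to pg_create_restore_point.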
Restore Procedures
Restore into a New Branch (Preferred)
The safest restore method creates a new branch database from the backup, leaving the original intact.
# Step 1: Create a new branch via AxiomDB Gateway
curl -X POST http://127.0.0.1:4060/api/branches \
-H "Content-Type: application/json" \
-d '{
"name": "restored-main-20250115",
"source_branch": "main",
"restore_point": "20250115-020001F"
}'
# Step 2: Verify the restored branch
psql -h 127.0.0.1 -p 5432 -U axiomdb_restored_main_20250115 \
-d restored_main_20250115 -c "SELECT count(*) FROM information_schema.tables;"
Point-in-Time Recovery (PITR)
Restore to a specific timestamp using WAL replay:
# Stop PostgreSQL
sudo systemctl stop postgresql
# Restore the base backup
pgbackrest --stanza=axiomdb \
--type=time \
--target="2025-01-15 14:30:00+00" \
--target-action=promote \
restore
# Start PostgreSQL
sudo systemctl start postgresql
# Verify
psql -h 127.0.0.1 -p 5432 -U axiomdb -c "SELECT now();"
Restore to Named Restore Point
pgbackrest --stanza=axiomdb \
--type=name \
--target="pre-migration-v2.3" \
--target-action=promote \
restore
Restore Specific Database Only
# Restore only a specific database from the backup.
# Databases that are not included are restored as sparse, zeroed files and must be dropped afterwards.
pgbackrest --stanza=axiomdb \
  --db-include=branch_xyz \
  restore
Restore Checklist
Before initiating any restore:
□ Identify the exact restore point (timestamp, backup label, or snapshot name)
□ Confirm the target branch name and whether to create a new branch
□ Verify backup availability: pgbackrest --stanza=axiomdb info
□ Check available disk space: df -h /var/lib/postgresql
□ Notify stakeholders of expected downtime (if restoring in-place)
□ Document the restore reason in the operations log
□ Test restore on a non-production branch first (if time permits)
After restore completion:
□ Verify database connectivity: psql -c "SELECT 1;"
□ Check table counts match expectations
□ Run application health checks
□ Verify recent migrations are present: SELECT * FROM _prisma_migrations ORDER BY started_at DESC LIMIT 10;
□ Test critical application endpoints
□ Monitor error rates for 30 minutes
□ Update status page / notify stakeholders
□ Document the restore outcome
Backup Verification
Integrity Checks
# Verify backup integrity (checks checksums)
pgbackrest --stanza=axiomdb verify
# Check for corruption in the repository
pgbackrest --stanza=axiomdb repo-get latest/base/backup.manifest
Test Restore
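# Restore to a temporary cluster for testing
# --type=immediate stops recovery as soon as the backup is consistent
# --archive-mode=off keeps the test cluster from pushing WAL into the production repository
pgbackrest --stanza=axiomdb \
  --type=immediate \
  --pg1-path=/tmp/test-restore \
  --archive-mode=off \
  --target-action=promote \
  restore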
# Start temporary instance
pg_ctl -D /tmp/test-restore start -o "-p 15432"
# Verify
psql -h 127.0.0.1 -p 15432 -U axiomdb -c "SELECT count(*) FROM information_schema.tables;"
# Cleanup
pg_ctl -D /tmp/test-restore stop
rm -rf /tmp/test-restore
Automated Verification Script
#!/bin/bash
# /opt/axiomdb/scripts/verify-backups.sh
set -euo pipefail
STANZA="axiomdb"
LOG_FILE="/var/log/axiomdb/backup-verify.log"
WEBHOOK_URL="https://hooks.slack.com/services/xxx"
log() { echo "[$(date -Iseconds)] $*" | tee -a "$LOG_FILE"; }
# Check backup freshness
# info --output=json returns an array with one entry per stanza
LAST_BACKUP=$(pgbackrest --stanza="$STANZA" info --output=json | \
  jq -r '.[0].backup[-1].timestamp.start')
BACKUP_AGE=$(( $(date +%s) - LAST_BACKUP ))
if [ "$BACKUP_AGE" -gt 90000 ]; then # 25 hours
log "CRITICAL: Last backup is $((BACKUP_AGE / 3600)) hours old"
curl -s -X POST "$WEBHOOK_URL" -d "{\"text\":\"🔴 Backup is $((BACKUP_AGE / 3600))h old\"}"
exit 1
fi
# Verify backup integrity
if ! pgbackrest --stanza="$STANZA" verify >> "$LOG_FILE" 2>&1; then
log "CRITICAL: Backup verification failed"
curl -s -X POST "$WEBHOOK_URL" -d "{\"text\":\"🔴 Backup verification failed\"}"
exit 1
fi
log "OK: Backup verified, age $((BACKUP_AGE / 3600))h"
exit 0
Disaster Recovery
Full Site Recovery
If the entire VPS is lost:
- Provision a new VPS with the same OS and disk layout
- Install PostgreSQL, PgBouncer, pgBackRest, and AxiomDB components
- Restore pgBackRest configuration and cipher key
- Run pgbackrest --stanza=axiomdb restore (the default restore type replays archived WAL to the latest available point)
- Start PostgreSQL and verify data integrity
- Restart AxiomDB Gateway and Ops Console
- Update DNS if needed
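The restore, start, and verify steps in the list above can be sketched as a single function. This is an outline under stated assumptions, not a complete runbook: it assumes the pgBackRest configuration and cipher key are already in place, that the caller can use sudo, and that the `postgres` role is reachable over the local socket. The function name `recover_site` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the restore, start, and verify steps of a full site recovery.
# Provisioning, package installation, and DNS changes are out of scope here.
recover_site() {
    stanza=${1:-axiomdb}

    # Default restore type: recover to the end of the archived WAL stream.
    if ! pgbackrest --stanza="$stanza" restore; then
        echo "restore failed for stanza $stanza" >&2
        return 1
    fi

    if ! sudo systemctl start postgresql; then
        echo "PostgreSQL failed to start" >&2
        return 1
    fi

    # Connectivity check only. PostgreSQL may still be replaying WAL right
    # after start, so a real runbook would likely retry this step.
    if ! psql -U postgres -Atc "SELECT 1;" >/dev/null; then
        echo "PostgreSQL is not accepting connections yet" >&2
        return 1
    fi

    echo "recovery complete for stanza $stanza"
}
```

Stopping at the first failed step keeps the sequence safe to re-run during a drill, since each step can be retried once the underlying problem is fixed.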
Recovery Time Objective (RTO)
| Scenario | Expected RTO | Notes |
|---|---|---|
| Single database restore | 5–15 min | New branch from backup |
| Point-in-time recovery | 15–30 min | WAL replay required |
| Full site recovery | 1–2 hours | Requires infrastructure rebuild |
| Cross-region restore | 2–4 hours | Network transfer time |
Test Your DR Plan
Run a full disaster recovery drill quarterly. Document the results, time taken, and any issues encountered. A backup that has never been tested is not a backup.