feat: add backup and restore scripts with steps #31

Open: wants to merge 5 commits into base: shutter-api

11 changes: 10 additions & 1 deletion _container_scripts/keyper-db-init.sh
@@ -2,4 +2,13 @@

set -e

-createdb -U postgres keyper
+echo "Checking for backup dump file..."
+if [ -f "/var/lib/postgresql/dump/keyper.dump" ]; then
+    echo "Backup dump found, restoring database with full schema and data..."
+    pg_restore -U postgres -d postgres --create --clean -v /var/lib/postgresql/dump/keyper.dump
+    rm -f /var/lib/postgresql/dump/keyper.dump
+    echo "Database restore completed."
+else
+    echo "No backup dump file found, creating fresh database..."
+    createdb -U postgres keyper
+fi
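
One thing worth checking in the restore branch: on a freshly initialised cluster the `keyper` database does not exist yet, so the drop issued by `--clean` may be reported as an error, and under `set -e` a non-zero exit from `pg_restore` would abort the init script. The following is a minimal sketch under that assumption, not the PR's code. It adds `--if-exists` (a standard `pg_restore` flag that is only valid together with `--clean`) and deletes the dump only after a successful restore. The function name `restore_or_create` and the `DUMP_FILE` variable are hypothetical.

```bash
#!/bin/sh
# Hypothetical variant of keyper-db-init.sh; names are illustrative only.
DUMP_FILE="${DUMP_FILE:-/var/lib/postgresql/dump/keyper.dump}"

restore_or_create() {
  if [ -f "$DUMP_FILE" ]; then
    echo "Backup dump found, restoring database..."
    # --if-exists makes the drop commands issued by --clean use IF EXISTS,
    # so a missing keyper database is not reported as an error.
    pg_restore -U postgres -d postgres --create --clean --if-exists -v "$DUMP_FILE" || return 1
    # Only reached when pg_restore succeeded, so a failed restore keeps the dump.
    rm -f "$DUMP_FILE"
    echo "Database restore completed."
  else
    echo "No backup dump file found, creating fresh database..."
    createdb -U postgres keyper
  fi
}
```

The init script would keep `set -e` at the top and end with a call to `restore_or_create`.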
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -47,6 +47,7 @@ services:
     volumes:
       - ./data/db:/var/lib/postgresql/data
       - ./_container_scripts/keyper-db-init.sh:/docker-entrypoint-initdb.d/keyper-db-init.sh:ro
+      - ./data/db-dump:/var/lib/postgresql/dump
     healthcheck:
       test: pg_isready -U postgres
       start_period: "30s"
72 changes: 72 additions & 0 deletions scripts/RESTORE.md
@@ -0,0 +1,72 @@
# Backup and Restore Guide

Review comment: We should probably mention how and where to store the backup (ideally on a different machine), that it doesn't contain all the data needed to recover (the signing key is missing), and that it still contains sensitive information.
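
To make the storage advice concrete, here is a minimal sketch, not part of the PR. It restricts the archive's permissions (it still contains sensitive data), records a SHA-256 checksum next to it, and verifies that checksum after the archive has been copied to another machine. The function names `seal_backup` and `verify_backup` are hypothetical, and the sketch assumes the `sha256sum` tool is available. The signing key is not in the archive and has to be stored separately.

```bash
#!/bin/sh
# Hypothetical helpers; not part of this PR.

# Restrict permissions and write <archive>.sha256 next to the archive.
seal_backup() {
  archive="$1"
  chmod 600 "$archive" || return 1
  # Work from the archive's directory so the checksum file stores a relative name.
  ( cd "$(dirname "$archive")" && \
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Verify an archive against its checksum file; non-zero exit on mismatch.
verify_backup() {
  archive="$1"
  ( cd "$(dirname "$archive")" && \
    sha256sum -c "$(basename "$archive").sha256" >/dev/null 2>&1 )
}
```

After copying both the archive and its `.sha256` file to the other machine, running `verify_backup` there confirms the copy is intact.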


## Backup Process

### Creating a Backup

1. **Ensure services are running** - The backup process requires the database to be accessible.
2. **Run the backup script**:
   ```bash
   ./scripts/backup.sh
   ```
3. **Backup location** - Backups are stored in the `data/backups/` directory.
4. **Backup naming** - Files are named with a timestamp: `shutter-api-keyper-YYYY-MM-DDTHH-MM-SS.tar.xz`
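
The naming scheme matters for restore: the timestamp runs from year down to second in fixed-width fields, so sorting archive names alphabetically also sorts them chronologically. A small sketch of how such a name can be produced and how archives can be listed oldest to newest follows. `archive_name` and `list_backups` are illustrative names, not functions defined by the scripts.

```bash
#!/bin/sh
# Illustrative helpers mirroring the naming used by scripts/backup.sh.

# Print an archive name carrying the current local timestamp.
archive_name() {
  echo "shutter-api-keyper-$(date +%Y-%m-%dT%H-%M-%S).tar.xz"
}

# List archives in the given directory, oldest first, newest last.
list_backups() {
  find "$1" -name "shutter-api-keyper-*.tar.xz" -type f | sort
}
```

For example, `list_backups data/backups | tail -n 1` prints the most recent archive.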

### What Gets Backed Up

- Database dump (`keyper.dump`) - Contains full schema and data from the `keyper` database
- Chain data (`data/chain/`) - Blockchain data and configuration
- Keyper configuration (`config/`) - Application configuration files
- Environment variables (`.env`) - All values except the signing key; `SIGNING_KEY` is replaced with a placeholder

## Restore Process

### Prerequisites

- **Empty keyper instance** - The restore *must* be performed on a fresh, empty deployment
- **No running services** - Ensure all Docker containers are stopped before restore
- **Backup file available** - The backup archive must be present in the `data/backups/` directory

### Restore Steps

1. **Set up the environment**:
   ```bash
   cp example-api.env .env
   # Edit .env with your configuration values
   ```

2. **Run the restore script**:
   ```bash
   ./scripts/restore.sh
   ```
   - Automatically finds the latest backup in `data/backups/`
   - Prompts for confirmation before proceeding
   - Restores all data to the appropriate locations

3. **Set the signing key**:
   - After restoring, set `SIGNING_KEY` in the `.env` file to the same value used in your original deployment.

4. **Start the services**:
   ```bash
   docker compose up -d
   ```
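
Step 3 can be done in an editor. As a sketch, the placeholder line written by the backup script could also be replaced from the shell. The function name `set_signing_key` is hypothetical, and the sketch assumes the key contains no `|`, `&`, or backslash characters (true for a plain hex string), since those are special in a `sed` replacement.

```bash
#!/bin/sh
# Hypothetical helper: set SIGNING_KEY in an env file.
set_signing_key() {
  env_file="$1"
  key="$2"
  tmp=$(mktemp) || return 1
  if grep -q '^SIGNING_KEY=' "$env_file"; then
    # Replace the existing line, for example the placeholder from the backup.
    sed "s|^SIGNING_KEY=.*|SIGNING_KEY=${key}|" "$env_file" > "$tmp" || return 1
  else
    # No such line yet: copy the file and append one.
    cat "$env_file" > "$tmp" || return 1
    echo "SIGNING_KEY=${key}" >> "$tmp"
  fi
  # Overwrite in place so the original file's permissions are kept.
  cat "$tmp" > "$env_file" && rm -f "$tmp"
}
```

A call such as `set_signing_key .env "$KEY"` would run from the repository root, between the restore script and `docker compose up -d`.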

### Restore Locations

- **Database**: `data/db-dump/keyper.dump` - Automatically restored to PostgreSQL
- **Chain data**: `data/chain/` - Keyper chain data and configuration
- **Configuration**: `config/` - Application configuration files
- **Environment**: `.env` - Replaced with the backed-up environment file; an existing `SIGNING_KEY` line is carried over

### Important Notes

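- **Database restoration** - The database is automatically restored on first startup via the initialization script (`_container_scripts/keyper-db-init.sh`), which deletes `data/db-dump/keyper.dump` after restoring it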
- **Service order** - Restore must be completed before starting any services
- **Data integrity** - The restore process overwrites existing data; ensure you have a clean instance
- **Configuration review** - Review restored configuration files before starting services
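
The first note has a practical consequence. Assuming the `db` service uses the official PostgreSQL image, scripts in `/docker-entrypoint-initdb.d` run only when the data directory is empty, so a leftover `data/db` would cause the dump to be ignored. The following is a minimal pre-flight check under that assumption; the function name `assert_fresh_db_dir` is hypothetical.

```bash
#!/bin/sh
# Hypothetical pre-flight check to run before `docker compose up -d`.
assert_fresh_db_dir() {
  db_dir="$1"
  # A missing directory is fine: it gets created on first start.
  [ -d "$db_dir" ] || return 0
  # Any entry (dotfiles included) means the init scripts would be skipped.
  if [ -n "$(ls -A "$db_dir" 2>/dev/null)" ]; then
    echo "Error: $db_dir is not empty; the init script would not run." >&2
    return 1
  fi
}
```

For example, `assert_fresh_db_dir data/db || exit 1` could run just before the services are started.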

### Troubleshooting

- **No backup found** - Ensure backup files exist in the `data/backups/` directory
- **Permission errors** - Ensure proper file permissions on backup files
- **Configuration issues** - Verify that restored configuration files are valid
70 changes: 70 additions & 0 deletions scripts/backup.sh
@@ -0,0 +1,70 @@
#!/usr/bin/env bash

set -euo pipefail

R='\033[0;31m'
G='\033[0;32m'
Y='\033[0;33m'
B='\033[0;34m'
DEF='\033[0m'

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
ARCHIVE_NAME="shutter-api-keyper-$(date +%Y-%m-%dT%H-%M-%S).tar.xz"

source "${SCRIPT_DIR}/../.env"

WORKDIR=$(mktemp -d)

cleanup() {
    rv=$?
    set +e
    echo -e "${R}Unexpected error, exit code: $rv, cleaning up.${DEF}"
    docker compose unpause || true
    exit $rv
}

trap cleanup EXIT

echo -e "${G}Creating backup archive${DEF}"

mkdir -p "${SCRIPT_DIR}/../data/backups"

echo -e "${B}[1/7] Pausing all services except database...${DEF}"
docker compose pause keyper || true

Review comment: This swallows errors; shouldn't we abort in that case? The same applies to the other pause commands below and to the down command in the restore script.

Review comment: Have you considered stopping the containers instead of pausing them? That would leave them in a better-defined state (e.g. not in the middle of a DB transaction or an RPC call).
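
Both points could be addressed together. The following is a minimal sketch, assuming the service names `keyper` and `chain` used by the script: stop the services instead of pausing them, and let a failure abort the backup instead of masking it with `|| true`. The `db` service stays up because `pg_dump` still needs it. `docker compose stop` and `docker compose start` are standard Compose subcommands; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical replacement for the pause/unpause calls in backup.sh.

# Stop the application services; a failure is returned to the caller.
stop_app_services() {
  docker compose stop keyper chain
}

# Start them again once the data has been copied.
start_app_services() {
  docker compose start keyper chain
}

# Run a command between stop and start, always attempting the restart.
with_services_stopped() {
  stop_app_services || { echo "Could not stop services, aborting." >&2; return 1; }
  rc=0
  "$@" || rc=$?
  start_app_services || rc=1
  return "$rc"
}
```

A copy step would then be wrapped as, for example, `with_services_stopped cp -a data/chain/ "$WORKDIR/chain"`.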

docker compose pause chain || true

echo -e "${B}[2/7] Creating database dump...${DEF}"
docker compose exec db pg_dump -U postgres -d keyper -Fc --create --clean -f /var/lib/postgresql/data/keyper.dump

echo -e "${B}[3/7] Pausing database...${DEF}"
docker compose pause db || true

echo -e "${B}[4/7] Copying data...${DEF}"
cp -a "${SCRIPT_DIR}/../data/chain/" "${WORKDIR}/chain"
cp -a "${SCRIPT_DIR}/../data/db/keyper.dump" "${WORKDIR}/keyper.dump"
Review comment (Contributor): As far as I can see, the source path is wrong.

Reply (Contributor Author): The dump is created in the db folder because a previous Docker setup would not have the new volume (db-dump) mapped in the compose file. The idea is to store the dump in the same directory where the Postgres data is stored and then copy it to db-dump at restore time. Let me know if this needs more clarification.
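
One way to sidestep the path question, sketched here under assumptions rather than proposed as the PR's approach, is to stream the dump to the host over standard output instead of writing it into the PostgreSQL data directory. `docker compose exec -T` disables pseudo-TTY allocation so binary output can be redirected, and `pg_dump` writes to standard output when `-f` is omitted. The function name `dump_keyper_db` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a custom-format dump of the keyper database to a host path.
dump_keyper_db() {
  out="$1"
  # Write to a temporary name first so a failed dump never leaves a partial file.
  if docker compose exec -T db pg_dump -U postgres -d keyper -Fc > "${out}.partial"; then
    mv "${out}.partial" "$out"
  else
    rm -f "${out}.partial"
    return 1
  fi
}
```

In the backup script this would be called as `dump_keyper_db "${WORKDIR}/keyper.dump"`, which would make the later copy from `data/db/` unnecessary.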

cp -a "${SCRIPT_DIR}/../config" "${WORKDIR}/keyper-config"

mkdir -p "${WORKDIR}/env-config"
# Copy the entire .env file but replace the private key value with a placeholder
if [ -f "${SCRIPT_DIR}/../.env" ]; then
    sed 's/^SIGNING_KEY=.*/SIGNING_KEY=PLACEHOLDER_REPLACE_WITH_YOUR_PRIVATE_KEY/' "${SCRIPT_DIR}/../.env" > "${WORKDIR}/env-config/.env"
    echo -e "${G}✓ Environment configuration backed up (private key replaced with placeholder)${DEF}"
else
    echo -e "${Y}⚠ .env file not found, skipping environment backup${DEF}"
fi

echo -e "${B}[5/7] Resuming services...${DEF}"
docker compose unpause || true

echo -e "${B}[6/7] Compressing archive...${DEF}"
docker run --rm -it -v "${WORKDIR}:/workdir" -v "${SCRIPT_DIR}/../data:/data" alpine:3.20.1 ash -c "apk -q --no-progress --no-cache add xz pv && tar -cf - -C /workdir . | pv -petabs \$(du -sb /workdir | cut -f 1) | xz -zq > /data/backups/${ARCHIVE_NAME}"

echo -e "${B}[7/7] Cleaning up...${DEF}"
rm -rf "$WORKDIR"

echo -e "${G}Done, backup archive created at ${B}data/backups/${ARCHIVE_NAME}${DEF}"

echo -e "\n\n${R}WARNING, IMPORTANT!${DEF}"
echo -e "${Y}If you import this backup, make sure to stop this deployment first!${DEF}"

trap - EXIT
120 changes: 120 additions & 0 deletions scripts/restore.sh
@@ -0,0 +1,120 @@
#!/usr/bin/env bash

set -euo pipefail

R='\033[0;31m'
G='\033[0;32m'
Y='\033[0;33m'
B='\033[0;34m'
DEF='\033[0m'

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
BACKUPS_DIR="${SCRIPT_DIR}/../data/backups"

WORKDIR=$(mktemp -d)

cleanup() {
    rv=$?
    set +e
    echo -e "${R}Unexpected error, exit code: $rv, cleaning up.${DEF}"
    rm -rf "$WORKDIR" || true
    exit $rv
}

trap cleanup EXIT

echo -e "${G}Restoring from latest backup${DEF}"

if [ ! -d "$BACKUPS_DIR" ]; then

Review comment: It's fine with me as is, but as a note: to my mind it would be more natural and easier to take the backup path as an argument instead of trying to find it automatically.

    echo -e "${R}Error: Backups directory not found at $BACKUPS_DIR${DEF}"
    exit 1
fi
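
The reviewer's suggestion and the current behaviour can coexist. The following is a minimal sketch, assuming an optional first argument to the script: an explicit path is used when given, and the newest archive is the fallback. The function name `select_backup` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the archive to restore.
# Usage: select_backup BACKUPS_DIR [ARCHIVE_PATH]
select_backup() {
  backups_dir="$1"
  explicit="${2:-}"
  if [ -n "$explicit" ]; then
    # An explicit path wins, but it must point at an existing file.
    [ -f "$explicit" ] || { echo "Error: backup not found: $explicit" >&2; return 1; }
    echo "$explicit"
    return 0
  fi
  # Fall back to the newest archive; the names sort chronologically.
  latest=$(find "$backups_dir" -name "shutter-api-keyper-*.tar.xz" -type f 2>/dev/null | sort | tail -n 1)
  [ -n "$latest" ] || { echo "Error: no backup files found in $backups_dir" >&2; return 1; }
  echo "$latest"
}
```

In `restore.sh` this could read `LATEST_BACKUP=$(select_backup "$BACKUPS_DIR" "${1:-}")`.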

LATEST_BACKUP=$(find "$BACKUPS_DIR" -name "shutter-api-keyper-*.tar.xz" -type f | sort | tail -n 1)

if [ -z "$LATEST_BACKUP" ]; then
    echo -e "${R}Error: No backup files found in $BACKUPS_DIR${DEF}"
    exit 1
fi

echo -e "${B}Found latest backup: ${Y}$(basename "$LATEST_BACKUP")${DEF}"

echo -e "${Y}WARNING: This will overwrite existing data!${DEF}"
read -p "Are you sure you want to continue? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
    echo -e "${R}Restore cancelled.${DEF}"
    exit 0

Review comment: This will still trigger the cleanup trap and print the "Unexpected error" message. In general, I think the cleaner approach is to always run the cleanup trap, even on successful exits (i.e. not disable the trap in the last line of the script), and to check the exit code in the cleanup function to distinguish expected from unexpected exits. Then you also don't have to duplicate the cleanup code.

fi
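
The approach described in the comment above can be sketched as follows: the trap stays armed for the whole script, cleanup always runs, and the error message is printed only for a non-zero exit code. This is an illustration rather than the PR's code; `demo_exit` is a hypothetical wrapper that exists only to show the behaviour in a subshell.

```bash
#!/usr/bin/env bash
# Hypothetical rework of the cleanup trap in restore.sh.
cleanup() {
  rv=$?
  set +e
  # Always remove the scratch directory, whatever the exit reason.
  if [ -n "${WORKDIR:-}" ]; then
    rm -rf "$WORKDIR"
  fi
  # Only complain when the script is exiting with a failure code.
  if [ "$rv" -ne 0 ]; then
    echo "Unexpected error, exit code: $rv, cleaned up." >&2
  fi
  exit "$rv"
}

# Demo: run in a subshell so the trap fires when the subshell exits.
demo_exit() {
  (
    WORKDIR=$(mktemp -d)
    trap cleanup EXIT
    echo "$WORKDIR"
    exit "$1"
  )
}
```

With this shape, the `exit 0` after "Restore cancelled." no longer prints an error, and the final `trap - EXIT` and the duplicated `rm -rf "$WORKDIR"` become unnecessary.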

echo -e "${B}[1/6] Stopping services...${DEF}"
docker compose down || true

echo -e "${B}[2/6] Extracting backup archive...${DEF}"
docker run --rm -v "$LATEST_BACKUP:/backup.tar.xz:ro" -v "$WORKDIR:/extract" alpine:3.20.1 ash -c "apk -q --no-progress --no-cache add xz && tar -xf /backup.tar.xz -C /extract"

echo -e "${B}[3/6] Restoring chain data...${DEF}"
if [ -d "$WORKDIR/chain" ]; then

Review comment: It's good that we check that the directory exists, but we should do so for all required parts (chain, keyper-config, keyper.dump) at once, before starting to overwrite anything. Otherwise we are more likely to end up in a corrupt, partially recovered state.

    mkdir -p "${SCRIPT_DIR}/../data/chain"

Review comment: This line seems unnecessary, given that the directory is removed immediately in the next line (`rm -rf` doesn't care whether the directory exists or not). The same applies below.

    rm -rf "${SCRIPT_DIR}/../data/chain"
    cp -a "$WORKDIR/chain" "${SCRIPT_DIR}/../data/chain"
    echo -e "${G}✓ Chain data restored${DEF}"
else
    echo -e "${Y}⚠ No chain data found in backup${DEF}"
    exit 1
fi
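
A minimal sketch of the up-front check the reviewer asks for: validate every required part of the extracted archive before any live data is touched. The part names are taken from the script itself; the function name `validate_backup` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: verify the extracted archive before overwriting anything.
validate_backup() {
  dir="$1"
  missing=0
  # Required directories.
  for d in chain keyper-config; do
    if [ ! -d "$dir/$d" ]; then
      echo "Error: missing directory in backup: $d" >&2
      missing=1
    fi
  done
  # Required database dump.
  if [ ! -f "$dir/keyper.dump" ]; then
    echo "Error: missing file in backup: keyper.dump" >&2
    missing=1
  fi
  # Report every problem found, then fail once.
  return "$missing"
}
```

Calling `validate_backup "$WORKDIR" || exit 1` right after the extraction step would let the per-part `else` branches be dropped.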

echo -e "${B}[4/6] Restoring keyper configuration...${DEF}"
if [ -d "$WORKDIR/keyper-config" ]; then
    mkdir -p "${SCRIPT_DIR}/../config"
    rm -rf "${SCRIPT_DIR}/../config"
    cp -a "$WORKDIR/keyper-config" "${SCRIPT_DIR}/../config"
    echo -e "${G}✓ Keyper configuration restored${DEF}"
else
    echo -e "${Y}⚠ No keyper-config found in backup${DEF}"
    exit 1
fi

echo -e "${B}[5/6] Restoring database dump...${DEF}"
if [ -f "$WORKDIR/keyper.dump" ]; then
    mkdir -p "${SCRIPT_DIR}/../data/db-dump"
    cp "$WORKDIR/keyper.dump" "${SCRIPT_DIR}/../data/db-dump/keyper.dump"
    echo -e "${G}✓ Database dump restored${DEF}"
else
    echo -e "${Y}⚠ No database dump found in backup${DEF}"
    exit 1
fi

echo -e "${B}[6/6] Restoring environment configuration...${DEF}"
if [ -f "$WORKDIR/env-config/.env" ]; then
    if [ -f "${SCRIPT_DIR}/../.env" ]; then
        cp "${SCRIPT_DIR}/../.env" "${SCRIPT_DIR}/../.env.backup.$(date +%Y%m%d_%H%M%S)"

        CURRENT_SIGNING_KEY=$(grep '^SIGNING_KEY=' "${SCRIPT_DIR}/../.env" 2>/dev/null || echo "")

        cp "$WORKDIR/env-config/.env" "${SCRIPT_DIR}/../.env"

        if [ -n "$CURRENT_SIGNING_KEY" ]; then
            echo "$CURRENT_SIGNING_KEY" >> "${SCRIPT_DIR}/../.env"
        fi

        echo -e "${G}✓ Environment configuration restored (private key preserved)${DEF}"
    else
        cp "$WORKDIR/env-config/.env" "${SCRIPT_DIR}/../.env"
        echo -e "${G}✓ Environment configuration restored${DEF}"
        echo -e "${Y}⚠ No existing SIGNING_KEY found, you'll need to set it manually${DEF}"
    fi
else
    echo -e "${Y}⚠ No env-config/.env found in backup${DEF}"
fi

echo -e "${B}Cleaning up...${DEF}"
rm -rf "$WORKDIR"

echo -e "${G}Restore completed successfully!${DEF}"
echo -e "${Y}Next steps:${DEF}"
echo -e "1. Review the restored configuration files"
echo -e "2. Start the services: ${B}docker compose up -d${DEF}"
echo -e "3. The database will be automatically restored on first startup"

trap - EXIT