Gsd-skill-creator openstack-backup

OpenStack backup operations skill for protecting cloud infrastructure through systematic backup strategies and disaster recovery procedures. Covers database backups (MariaDB full and incremental with mariabackup), configuration backups (globals.yml, inventory, Fernet keys), volume snapshots (Cinder LVM snapshots), image exports (Glance), instance snapshots (Nova), backup encryption (GPG/OpenSSL), retention policies (daily/weekly/monthly rotation), restore procedures (database point-in-time recovery, service rebuild), RPO/RTO planning, and disaster recovery drills. Use when planning backup strategy, scheduling automated backups, testing restore procedures, or executing disaster recovery.

install
source · Clone the upstream repo
git clone https://github.com/Tibsfox/gsd-skill-creator
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/openstack/backup" ~/.claude/skills/tibsfox-gsd-skill-creator-openstack-backup && rm -rf "$T"
manifest: skills/openstack/backup/SKILL.md
source content

OpenStack Backup Operations

A cloud without tested backups is a cloud waiting to lose data. Backup operations protect the entire stack -- from the databases that store state, to the configurations that define behavior, to the volumes that hold user data. The backup hierarchy ensures that any component can be recovered independently, from a single Keystone Fernet key to a complete cloud rebuild.

Backup planning starts with two numbers: RPO (Recovery Point Objective) -- how much data loss is acceptable (e.g., "no more than 1 hour of changes"), and RTO (Recovery Time Objective) -- how long recovery can take (e.g., "service restored within 4 hours"). These numbers drive every decision: backup frequency, storage location, retention depth, and automation investment.

In NASA SE terms, backup maps to Phase E (Operations & Sustainment) as a core sustainment activity, and Phase F (Closeout) for data archive procedures. The backup hierarchy follows the same defense-in-depth philosophy NASA applies to mission-critical data: multiple copies, multiple formats, verified recoverability.

Deploy

Backup Infrastructure Setup

Backup scripts location:

# Create backup directory structure
mkdir -p /opt/openstack-backups/{database,config,volumes,images,snapshots}
mkdir -p /opt/openstack-backups/scripts
mkdir -p /opt/openstack-backups/logs

# Set ownership and permissions
chown -R root:root /opt/openstack-backups
chmod 700 /opt/openstack-backups

Kolla-Ansible built-in database backup:

# MariaDB backup (built into Kolla-Ansible)
kolla-ansible -i inventory mariadb_backup

# Backup location: /var/lib/docker/volumes/mariadb/_data/backups/
# File: mysqlbackup-<timestamp>.qp.xb.gz

Cron scheduling for automated backups:

# /etc/cron.d/openstack-backup
# Daily database backup at 02:00
0 2 * * * root /opt/openstack-backups/scripts/backup-databases.sh >> /opt/openstack-backups/logs/db-backup.log 2>&1

# Daily config backup at 02:30
30 2 * * * root /opt/openstack-backups/scripts/backup-configs.sh >> /opt/openstack-backups/logs/config-backup.log 2>&1

# Weekly volume snapshots (Sunday 03:00)
0 3 * * 0 root /opt/openstack-backups/scripts/backup-volumes.sh >> /opt/openstack-backups/logs/volume-backup.log 2>&1

Backup storage targets:

TargetUse CaseConfiguration
Local diskFast restore, primary copy
/opt/openstack-backups/
NFS mountOff-host copy, disaster recoveryMount at
/mnt/backup-nfs/
Swift (S3-compatible)Object storage archive, long-term retentionUse
openstack object create
or s3cmd

Configure

MariaDB Backup Strategy

Full backup (weekly):

#!/bin/bash
# backup-databases.sh -- full MariaDB backup with mariabackup
BACKUP_DIR="/opt/openstack-backups/database"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/full_${TIMESTAMP}"

# Use Kolla-Ansible's built-in backup
kolla-ansible -i /etc/kolla/inventory mariadb_backup

# Or manual mariabackup via container
docker exec mariadb mariabackup --backup \
  --target-dir=/backup/full_${TIMESTAMP} \
  --user=root --password=${DB_ROOT_PASSWORD}

echo "Full backup completed: ${BACKUP_FILE}" >> /opt/openstack-backups/logs/db-backup.log

Incremental backup (daily, between full backups):

# Incremental based on last full backup
LAST_FULL=$(ls -d ${BACKUP_DIR}/full_* | sort | tail -1)
docker exec mariadb mariabackup --backup \
  --target-dir=/backup/inc_${TIMESTAMP} \
  --incremental-basedir=${LAST_FULL} \
  --user=root --password=${DB_ROOT_PASSWORD}

Service-Specific Backups

RabbitMQ definitions export:

# Export queue definitions, exchanges, bindings
docker exec rabbitmq rabbitmqctl export_definitions /tmp/rabbitmq_definitions.json
docker cp rabbitmq:/tmp/rabbitmq_definitions.json \
  /opt/openstack-backups/config/rabbitmq_definitions_$(date +%Y%m%d).json

Keystone Fernet key backup:

# Critical: Fernet keys are required to validate existing tokens
docker cp keystone_api:/etc/kolla/keystone/fernet-keys/ \
  /opt/openstack-backups/config/fernet-keys_$(date +%Y%m%d)/

Cinder volume snapshot policies:

# Create snapshot of all in-use volumes
for vol_id in $(openstack volume list --status in-use -f value -c ID); do
  openstack volume snapshot create --volume ${vol_id} \
    --description "Scheduled backup $(date +%Y%m%d)" \
    backup_${vol_id}_$(date +%Y%m%d)
done

Glance image export:

# Export critical images (base OS, golden images)
for image_id in $(openstack image list --status active -f value -c ID); do
  openstack image save --file /opt/openstack-backups/images/${image_id}.raw ${image_id}
done

Nova instance snapshot:

# Snapshot critical instances
openstack server image create --name "backup_$(date +%Y%m%d)" <instance-id>

Configuration File Backup

#!/bin/bash
# backup-configs.sh -- backup all Kolla-Ansible configurations
BACKUP_DIR="/opt/openstack-backups/config/kolla_$(date +%Y%m%d)"
mkdir -p ${BACKUP_DIR}

# Core configuration files
cp /etc/kolla/globals.yml ${BACKUP_DIR}/
cp /etc/kolla/passwords.yml ${BACKUP_DIR}/
cp -r /etc/kolla/inventory ${BACKUP_DIR}/
cp /etc/kolla/admin-openrc.sh ${BACKUP_DIR}/

# Per-service configuration overrides
for svc in keystone nova neutron cinder glance swift heat horizon; do
  [ -d /etc/kolla/${svc} ] && cp -r /etc/kolla/${svc} ${BACKUP_DIR}/
done

# Encrypt sensitive files
gpg --symmetric --cipher-algo AES256 \
  --output ${BACKUP_DIR}/passwords.yml.gpg \
  ${BACKUP_DIR}/passwords.yml
rm ${BACKUP_DIR}/passwords.yml

echo "Config backup completed: ${BACKUP_DIR}"

Backup Encryption

GPG encryption for backup files:

# Encrypt a backup file
gpg --symmetric --cipher-algo AES256 \
  --batch --passphrase-file /root/.backup-passphrase \
  --output backup.tar.gz.gpg backup.tar.gz

# Decrypt
gpg --decrypt --batch --passphrase-file /root/.backup-passphrase \
  --output backup.tar.gz backup.tar.gz.gpg

OpenSSL alternative:

# Encrypt
openssl enc -aes-256-cbc -salt -pbkdf2 \
  -in backup.tar.gz -out backup.tar.gz.enc \
  -pass file:/root/.backup-passphrase

# Decrypt
openssl enc -aes-256-cbc -d -pbkdf2 \
  -in backup.tar.gz.enc -out backup.tar.gz \
  -pass file:/root/.backup-passphrase

Retention Policies

TierFrequencyRetentionStorage Target
DailyEvery day at 02:007 daysLocal disk
WeeklySunday 03:00 (full DB)4 weeksNFS mount
MonthlyFirst Sunday of month12 monthsSwift/S3 archive

Rotation script:

# Purge backups older than retention period
find /opt/openstack-backups/database -name "*.gz" -mtime +7 -delete
find /mnt/backup-nfs/database -name "*.gz" -mtime +28 -delete
# Monthly backups on Swift are managed by object expiry headers

Backup Storage Sizing

ComponentTypical SizeGrowth Rate30-day Estimate
MariaDB full500MB-2GB~50MB/day~3.5GB
MariaDB incremental50-200MBPer day~4.2GB
Kolla configs10-50MBStable~50MB
Fernet keys<1MBStable<1MB
Volume snapshotsVaries by usagePer volumePlan per tenant
Glance images1-10GB per imagePer uploadPlan per image

Operate

Scheduled Backup Verification

Restore test procedure (monthly):

# 1. Create isolated test database container
docker run -d --name mariadb_restore_test \
  -e MYSQL_ROOT_PASSWORD=testpass \
  mariadb:latest

# 2. Restore latest backup into test container
docker exec mariadb_restore_test mariabackup --prepare \
  --target-dir=/backup/latest_full

docker exec mariadb_restore_test mariabackup --copy-back \
  --target-dir=/backup/latest_full

# 3. Verify data integrity
docker exec mariadb_restore_test mysql -u root -ptestpass \
  -e "SELECT COUNT(*) FROM keystone.project; SELECT COUNT(*) FROM nova.instances;"

# 4. Clean up
docker rm -f mariadb_restore_test

Disaster Recovery Drill Procedures

Full restore from backup (complete cloud rebuild):

  1. Redeploy infrastructure:
    kolla-ansible -i inventory bootstrap-servers
  2. Restore MariaDB from latest full + incremental backups
  3. Restore Kolla configuration files from config backup
  4. Deploy services:
    kolla-ansible -i inventory deploy
  5. Restore Fernet keys (critical for existing tokens)
  6. Verify all services:
    openstack token issue
    ,
    openstack service list
  7. Restore volume snapshots as needed
  8. Verify user data access

Partial service restore (single service rebuild):

# Example: rebuild Nova after database corruption
# 1. Restore Nova database
docker exec -i mariadb mysql -u root -p nova < nova_backup.sql

# 2. Redeploy Nova only
kolla-ansible -i inventory deploy --tags nova

# 3. Verify
openstack server list  # Should show existing instances
openstack compute service list  # All services should be up

Database point-in-time recovery:

# 1. Prepare full backup
mariabackup --prepare --target-dir=/backup/full_base

# 2. Apply incrementals in order
mariabackup --prepare --target-dir=/backup/full_base \
  --incremental-dir=/backup/inc_001
mariabackup --prepare --target-dir=/backup/full_base \
  --incremental-dir=/backup/inc_002

# 3. Copy back to data directory
mariabackup --copy-back --target-dir=/backup/full_base

Backup Catalog Maintenance

Maintain a catalog of all backups with metadata:

# Append to backup catalog after each successful backup
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ),database,full,${BACKUP_FILE},$(du -sh ${BACKUP_FILE} | cut -f1),$(sha256sum ${BACKUP_FILE} | cut -d' ' -f1)" \
  >> /opt/openstack-backups/catalog.csv

Catalog format:

timestamp,type,level,path,size,checksum

Backup Rotation Execution

#!/bin/bash
# rotate-backups.sh -- enforce retention policies
DAILY_KEEP=7
WEEKLY_KEEP=28
MONTHLY_KEEP=365

# Remove old daily backups
find /opt/openstack-backups/database/daily -mtime +${DAILY_KEEP} -delete -print >> /opt/openstack-backups/logs/rotation.log

# Remove old weekly backups
find /opt/openstack-backups/database/weekly -mtime +${WEEKLY_KEEP} -delete -print >> /opt/openstack-backups/logs/rotation.log

# Update catalog
grep -v "DELETED" /opt/openstack-backups/catalog.csv > /opt/openstack-backups/catalog.csv.tmp
mv /opt/openstack-backups/catalog.csv.tmp /opt/openstack-backups/catalog.csv

echo "Rotation completed: $(date)" >> /opt/openstack-backups/logs/rotation.log

Troubleshoot

1. MariaDB Backup Failure

Symptoms:

kolla-ansible mariadb_backup
exits non-zero. No backup file created.

CauseDiagnosisFix
Disk full
df -h /var/lib/docker/volumes/mariadb/
shows >95% used
Free space: purge old backups, extend volume
Lock timeoutBackup log:
Lock wait timeout exceeded
Retry during low-activity period; increase
innodb_lock_wait_timeout
Connection refusedBackup log:
Can't connect to local MySQL server
Verify MariaDB running:
docker ps --filter name=mariadb
; restart if needed
Insufficient privilegesBackup log:
Access denied
Verify backup user has RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT grants

2. Volume Snapshot Stuck in "creating" State

Symptoms:

openstack volume snapshot list
shows status "creating" indefinitely.

CauseDiagnosisFix
Cinder backend issue
docker logs cinder_volume 2>&1 | tail -50
Check LVM:
lvs
for snapshot overflow; extend VG if needed
LVM snapshot overflow
lvs -o lv_name,snap_percent
shows >100%
Delete overflow snapshot:
openstack volume snapshot delete <id> --force
; increase snapshot reserve
iSCSI target locked
iscsiadm -m session
shows stale sessions
Clear stale sessions; restart target service
Cinder volume service down
openstack volume service list
shows disabled
Enable service:
openstack volume service set --enable <host> cinder-volume

3. Restore Fails

Symptoms: Backup restoration exits with errors. Database or service does not come up cleanly.

CauseDiagnosisFix
Incompatible backup versionmariabackup:
This backup is from a different version
Restore using matching mariabackup version; or use mysqldump (logical backup)
Corrupted backupmariabackup --prepare fails with checksum errorUse previous backup; verify checksums before relying on backups
Missing encryption keygpg:
decryption failed: No secret key
Locate key file; restore from secure key backup location
Filesystem permission mismatchService fails to start after restore:
Permission denied
chown -R mysql:mysql /var/lib/mysql/
; verify container user mapping

4. Backup Storage Full

Symptoms: Backup scripts fail with "No space left on device". Retention not enforced.

CauseDiagnosisFix
Retention not enforcedOld backups still present:
find /opt/openstack-backups -mtime +30 -ls
Run rotation script manually; fix cron schedule
Unexpected growthSingle large backup:
du -sh /opt/openstack-backups/*
Investigate cause (large DB tables, uncompressed backups); add compression
NFS mount disconnected
df -h /mnt/backup-nfs/
shows local mount
Remount NFS; check network connectivity to NFS server

5. Incremental Backup Chain Broken

Symptoms: Cannot restore because intermediate incremental backup is missing or corrupted.

CauseDiagnosisFix
Missing parent backup
mariabackup --prepare
fails:
missing incremental base
Start new full backup; rebuild incremental chain from new base
Backup file corruptedChecksum mismatch in catalogUse last known-good full backup; accept data loss to RPO of that backup
Catalog out of syncBackup files exist but catalog references wrong pathsRebuild catalog from filesystem: scan backup directories and regenerate

Integration Points

Backup connects to every data-bearing component of the cloud stack:

Core OpenStack services: Backs up databases and configuration files for all 8 services. Each service's database (keystone, nova, neutron, cinder, glance, swift, heat, horizon) is included in the MariaDB backup set. Service-specific configs are backed up from

/etc/kolla/<service>/
.

Monitoring skill: Backup job success/failure metrics feed Prometheus alerting. Monitoring tracks backup duration, storage consumption, and schedule adherence. Alert on backup failures or missed backup windows.

Security skill: Backup encryption protects sensitive data at rest (passwords.yml, Fernet keys, database dumps). Credential backup requires secure storage with access controls. Security audit includes backup access review.

Capacity skill: Backup storage consumption is included in capacity planning models. Growth rate of backup storage drives storage provisioning decisions.

Kolla-ansible-ops skill: Backup is a prerequisite before upgrade and reconfigure operations. The upgrade procedure mandates a verified backup before any destructive changes.

kolla-ansible mariadb_backup
is the first step in upgrade runbooks.

NASA SE Cross-References

SE PhaseBackup ActivityReference
Phase B (Preliminary Design)Define backup requirements: RPO/RTO targets, storage requirements, retention policiesSP-6105 SS 4.3-4.4
Phase E (Operations & Sustainment)Execute backup schedules, verify restore procedures, maintain backup catalog, conduct DR drillsSP-6105 SS 5.4-5.5
Phase E (Technical Data Management)Backup as data management activity: ensure all operational data is recoverable per retention policySP-6105 SS 6.5
Phase F (Closeout)Data archive procedures: export and archive all cloud data before decommission, verify archive integritySP-6105 SS 6.1