Ransomware Recovery Lab: Immutable Backups and Restore Drills

Ransomware response is mostly about recovery. If your backups are slow, mutable, or untested, you do not have a recovery plan. A home lab is the perfect place to practice immutable backups, snapshot isolation, and restore drills without real risk.

This post describes a practical ransomware recovery lab using ZFS snapshots and restic. The focus is on immutability, testing, and verification, which are the defenses that still matter once files have been encrypted.

Design goals

A good recovery lab should provide:

  • Immutable or append-only backups
  • Multiple restore points, so you can meet a recovery point objective (RPO)
  • A repeatable restore process, so you can meet a recovery time objective (RTO)
  • Verification that backups are usable

To test these goals, simulate ransomware by encrypting a test directory, then restore that directory from backups.

ZFS snapshots

If you use ZFS in your lab, snapshots are fast and space-efficient. Create a dataset for lab data and schedule snapshots.

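# Create a dedicated dataset for lab data
sudo zfs create tank/labdata
# Take a dated snapshot; a cron job or systemd timer can repeat this daily
sudo zfs snapshot tank/labdata@daily-2025-12-03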

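Snapshots are always read-only, so ransomware cannot modify the data inside them. An attacker with root on the same host can still remove them with zfs destroy, so for extra safety replicate them to a separate pool or a backup host.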
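
One way to turn "schedule snapshots" into something a timer can call is sketched below. The zfs snapshot and zfs hold subcommands are standard OpenZFS; the helper names, the "keep" hold tag, and the DRY_RUN switch are assumptions made for this illustration. A hold makes zfs destroy refuse the snapshot until the hold is released, which guards against mistakes more than against a determined root user.

```shell
#!/bin/sh
# Sketch: create a dated snapshot and place a hold on it.
# DATASET, the "daily" prefix, and the "keep" hold tag are illustrative assumptions.
DATASET="${DATASET:-tank/labdata}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the zfs commands

# Build a snapshot name such as tank/labdata@daily-2025-12-03.
snapshot_name() {
  # $1 = prefix, $2 = date in YYYY-MM-DD form
  printf '%s@%s-%s\n' "$DATASET" "$1" "$2"
}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

snap="$(snapshot_name daily "$(date +%Y-%m-%d)")"
run zfs snapshot "$snap"
# A held snapshot cannot be destroyed until the hold is removed with zfs release.
run zfs hold keep "$snap"
```

Run it as root with DRY_RUN=0 once the printed commands look right for your pool.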

Restic with append-only repo

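Restic repositories can be made append-only, but the restriction is enforced by the storage backend rather than by restic itself, for example rest-server started with --append-only or rclone serve restic --append-only. This resists ransomware as long as the attacker cannot change the backend configuration. Start by creating a repo on a separate disk or NAS: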

export RESTIC_REPOSITORY=/mnt/backup/restic
# Lab placeholder; use a real passphrase and keep a copy outside the repository
export RESTIC_PASSWORD='strong-passphrase'
restic init

Then back up your lab data:

restic backup /srv/labdata

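Making the repo append-only takes more than file permissions. Mounting it read-only for the backup user would block new backups, and ordinary write permission on a directory also allows deleting the files in it. Instead, expose the repository through a backend that enforces append-only access, such as rest-server with --append-only running on the backup host, and run pruning from a separate trusted machine.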
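
A minimal sketch of that setup follows, assuming rest-server is installed on the backup host. The host name backup.lab.example, port 8000, and the DRY_RUN switch are assumptions for illustration. The two halves run on different machines, and the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch: serve the repository in append-only mode from the backup host,
# then point the lab client at it. Host name, port, and paths are assumptions.
DRY_RUN="${DRY_RUN:-1}"
BACKUP_HOST="${BACKUP_HOST:-backup.lab.example}"   # hypothetical host name
REPO_PATH="${REPO_PATH:-/mnt/backup/restic}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build the restic REST backend URL for a given host.
repo_url() {
  printf 'rest:http://%s:8000/\n' "$1"
}

# On the backup host: clients may add data but not delete or overwrite it.
run rest-server --path "$REPO_PATH" --listen :8000 --append-only

# On the lab client: use the REST backend instead of a local path.
RESTIC_REPOSITORY="$(repo_url "$BACKUP_HOST")"
export RESTIC_REPOSITORY
run restic backup /srv/labdata
```

rest-server expects an .htpasswd file unless it is started with --no-auth, so a real deployment would add credentials to the URL and ideally TLS. Both are left out here to keep the sketch short.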

Backup integrity and verification

Backups can fail silently. Restic includes a check command that verifies repository integrity. Run it on a schedule and alert if it fails. This is especially important if your backup target is a consumer NAS or external disk that may develop errors.

restic check --read-data-subset=5%

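The --read-data-subset option keeps the check fast by reading a randomly chosen 5% of the repository data on each run. A full restic check --read-data verifies everything but takes much longer, so reserve it for occasional runs.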
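
To run the check on a schedule and alert on failure, a small wrapper is usually enough. The sketch below assumes nothing about your alerting stack: notify is a placeholder that writes to stderr, and LOG_FILE is an arbitrary path chosen for the example.

```shell
#!/bin/sh
# Sketch: run a repository check and raise an alert when it fails.
# notify() is a placeholder; replace it with mail, a webhook, or your SIEM.
LOG_FILE="${LOG_FILE:-/tmp/restic-check.log}"

notify() {
  printf 'ALERT: %s\n' "$1" >&2
}

# Run the given command, keep its output, and alert on a non-zero exit status.
run_check() {
  if "$@" >"$LOG_FILE" 2>&1; then
    printf 'check ok\n'
    return 0
  fi
  notify "backup check failed, see $LOG_FILE"
  return 1
}
```

From cron or a systemd timer, source the file and call run_check restic check --read-data-subset=5% so that a failed check produces a non-zero exit status as well as the alert.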

Air-gapped and offline copies

Immutable does not always mean offline. If an attacker gains admin access, they may still delete or encrypt backups. Keep at least one copy offline, such as an external drive that is only connected during backup windows. In a lab, even a weekly offline copy is a useful habit.

If you use ZFS replication, replicate to a host that is not domain-joined and does not share credentials with your primary environment. This reduces the chance that ransomware on a Windows host can reach your backup target.

Restore drills

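A backup that has not been restored is not a backup. Run a weekly drill that restores a snapshot to a test directory and compares checksums. The example below uses latest; now and then pick an older snapshot ID from restic snapshots so that older restore points get tested too.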

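mkdir -p /tmp/restore-test
# Restore the most recent snapshot into the scratch directory
restic restore latest --target /tmp/restore-test
# Restored files keep their full original path under the target directory
sha256sum /srv/labdata/critical.db /tmp/restore-test/srv/labdata/critical.db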

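If the checksums match, the restore path works for that file. A mismatch is expected when the live file has changed since the backup ran, so for a strict test compare against a file you know is static, or against a checksum recorded at backup time.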
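
Checking a single file leaves most of the restore untested. The sketch below compares every file in two directory trees by SHA-256. The function names are made up for this example, and it assumes the source tree has not changed since the snapshot being tested was taken.

```shell
#!/bin/sh
# Sketch: compare every file in a source tree against its restored copy.
# Assumes the source has not changed since the tested snapshot was taken.

# Print "sha256  ./relative/path" for every file under $1, sorted by path.
manifest() {
  (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Return 0 when both trees hold the same files with the same contents.
# Differences are printed in diff format.
compare_trees() {
  src_list="$(mktemp)"
  dst_list="$(mktemp)"
  manifest "$1" >"$src_list"
  manifest "$2" >"$dst_list"
  if diff "$src_list" "$dst_list"; then
    result=0
  else
    result=1
  fi
  rm -f "$src_list" "$dst_list"
  return "$result"
}
```

After a restore, compare_trees /srv/labdata /tmp/restore-test/srv/labdata && echo "restore verified" covers the whole dataset in one step.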

RTO and RPO measurement

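Define your Recovery Time Objective (RTO), the longest outage you can accept, and your Recovery Point Objective (RPO), the largest window of data loss you can accept. In a lab, measure how long it takes to restore a dataset and how much data you would lose between backups. With daily backups, the worst case is close to 24 hours of changes. If it takes two hours to restore and your RTO is 30 minutes, your plan is not realistic.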

Record these metrics and track them over time. As your lab grows, restore times will increase. Measuring now helps you make informed decisions about backup frequency and storage choices.
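
Timing can be as simple as wrapping the restore command. The sketch below is one possible shape: RTO_SECONDS, METRICS_LOG, and the function names are assumptions, and the measurement is wall-clock time at one-second resolution.

```shell
#!/bin/sh
# Sketch: time a restore command and compare the result with the RTO.
# RTO_SECONDS and the log path are assumptions; adjust them to your lab.
RTO_SECONDS="${RTO_SECONDS:-1800}"   # 30 minutes
METRICS_LOG="${METRICS_LOG:-/tmp/restore-metrics.csv}"

# Run the given command and set ELAPSED to its duration in whole seconds.
time_restore() {
  start="$(date +%s)"
  "$@"
  rc=$?
  end="$(date +%s)"
  ELAPSED=$((end - start))
  return "$rc"
}

# Print whether a measured duration (in seconds) meets the RTO.
rto_verdict() {
  if [ "$1" -le "$RTO_SECONDS" ]; then
    printf 'within RTO\n'
  else
    printf 'exceeds RTO\n'
  fi
}

# Append "date,seconds,verdict" so the trend can be tracked over time.
record_metric() {
  printf '%s,%s,%s\n' "$(date +%Y-%m-%d)" "$1" "$(rto_verdict "$1")" >>"$METRICS_LOG"
}
```

A drill could then run time_restore restic restore latest --target /tmp/restore-test && record_metric "$ELAPSED". With the defaults, a two-hour restore (7200 seconds) is logged as exceeding the RTO.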

Backup infrastructure and secrets

Data files are not the only thing you need. Back up configuration files, infrastructure as code, and secrets storage. If you rebuild a server but do not have its configuration, you will lose more time than the data itself.

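In a lab, keep a separate backup of /etc, your automation playbooks, and any password vault exports. Store these in the same immutable backup system so they are protected during a ransomware event. The one exception is the backup repository's own passphrase: keep a copy outside the repository, because a passphrase stored only inside an encrypted backup cannot be used to open it.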

Immutable object storage

If you have access to object storage that supports versioning and object lock, use it. Services like MinIO can provide this in a homelab. Enable versioning and set retention policies so that even if credentials are compromised, old versions remain accessible.

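Test this by uploading a file, overwriting it, and then restoring the previous version. Then try to delete a locked version before its retention period ends and confirm that the request is refused. Together these checks show that versioning and object lock behave as expected before you rely on them during a real incident.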
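
With MinIO, the bucket setup might look like the sketch below, which uses the mc client. The "lab" alias, the bucket name, and the 30-day retention period are assumptions, and the flags are worth checking against mc --help for your client version. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: bucket with object lock, versioning, and a default retention period.
# The "lab" alias, bucket name, and 30-day period are assumptions.
DRY_RUN="${DRY_RUN:-1}"
TARGET="${TARGET:-lab/backups}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Create and configure the bucket named in $1.
setup_bucket() {
  # Object lock is requested when the bucket is created.
  run mc mb --with-lock "$1"
  run mc version enable "$1"
  # COMPLIANCE mode: locked versions cannot be removed before the period ends.
  run mc retention set --default COMPLIANCE 30d "$1"
}

setup_bucket "$TARGET"
# After uploading and overwriting a test object, list its versions.
run mc ls --versions "$TARGET"
```

Because COMPLIANCE retention cannot be shortened once set, a lab bucket with a short period, or GOVERNANCE mode, is a gentler place to experiment.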

Runbooks and communication

Recovery is a process, not a single command. Write a short runbook that lists the restore order, credentials, and validation steps. In a lab, practice the runbook during a simulated incident so you can measure how long it takes to execute each step.

Also plan communication. Even in a small environment, you need to know who is responsible for restoring services and who validates the data. Clear ownership reduces confusion when time is tight.

Simulating ransomware

In a controlled lab, you can simulate ransomware by encrypting files with openssl or a test script. The idea is to validate detection and recovery workflows.

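# Lab only: encrypt each .txt file, then delete the original as ransomware would
find /srv/labdata -type f -name '*.txt' -exec sh -c \
  'openssl enc -aes-256-cbc -salt -pbkdf2 -in "$1" -out "$1.enc" -pass pass:lab && rm -- "$1"' _ {} \;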

Now attempt to restore from a snapshot or restic and confirm that the original files return intact.

Monitoring and alerting

Use file integrity monitoring or simple hash checks to detect sudden mass changes. If you have Wazuh or another SIEM, create an alert for a large number of file modifications in a short time window.

This is where a lab SIEM is useful. You can test how quickly your alerting triggers and whether it is too noisy.
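
If you do not run a SIEM yet, the "simple hash checks" option can be sketched as a manifest comparison. The function names and the THRESHOLD default are assumptions for this example. Keep the manifest files outside the monitored directory so they are not counted, or encrypted along with the data.

```shell
#!/bin/sh
# Sketch: count files that changed or disappeared since a baseline manifest.
# THRESHOLD and the function names are assumptions for this example.
THRESHOLD="${THRESHOLD:-50}"

# Write "sha256  ./path" for every file under $1 into the manifest file $2.
write_manifest() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort) >"$2"
}

# Count baseline lines ($1) that no longer appear in the current manifest ($2).
# Modified, deleted, and renamed files all count as changes.
count_changes() {
  LC_ALL=C comm -23 "$1" "$2" | wc -l | tr -d ' '
}

# Succeed when the number of changes reaches the threshold.
mass_change_detected() {
  [ "$(count_changes "$1" "$2")" -ge "$THRESHOLD" ]
}
```

A timer could call write_manifest /srv/labdata /var/lib/fim/current (a hypothetical path), compare it with the previous run, and raise an alert when mass_change_detected succeeds. Running it right after the openssl simulation is a quick way to check whether the threshold is set sensibly.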

Lab checklist

Use this checklist to validate your recovery posture:

  • Run a restore drill, record the total time, and compare it with your RTO.
  • Verify snapshots or restic backups are immutable or append-only.
  • Restore a random file and confirm checksums match.
  • Test alerting for mass file changes and confirm it triggers.

Operational tips

  • Keep at least one offline or write-once copy.
  • Do not store credentials that can delete or rewrite backups on the same host as the primary data.
  • Automate snapshot and backup schedules with cron or systemd timers.
  • Document the full restore procedure and keep it accessible offline.
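
For the scheduling tip, a systemd timer is one option. The sketch below writes a service and timer pair into a local directory for review. The unit names, the /etc/restic/lab.env environment file, and the /usr/bin/restic path are assumptions.

```shell
#!/bin/sh
# Sketch: generate a systemd service and timer for a daily restic backup.
# Unit names, the environment file path, and the schedule are assumptions.
UNIT_DIR="${UNIT_DIR:-./units}"   # review, then copy to /etc/systemd/system

# Write lab-backup.service and lab-backup.timer into the directory $1.
write_units() {
  mkdir -p "$1"
  cat >"$1/lab-backup.service" <<'EOF'
[Unit]
Description=Back up lab data with restic

[Service]
Type=oneshot
# Holds RESTIC_REPOSITORY and RESTIC_PASSWORD; keep it readable by root only.
EnvironmentFile=/etc/restic/lab.env
ExecStart=/usr/bin/restic backup /srv/labdata
EOF
  cat >"$1/lab-backup.timer" <<'EOF'
[Unit]
Description=Daily lab backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

write_units "$UNIT_DIR"
```

After copying the units into place, systemctl daemon-reload followed by systemctl enable --now lab-backup.timer activates the schedule. Persistent=true makes systemd run a missed backup at the next boot.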

Takeaways

Ransomware defense is about recovery under pressure. Immutable snapshots and append-only backups give you a fighting chance, but only if you practice restores. Build the habit in your lab now, and you will be far more resilient when it matters.

This post is licensed under CC BY 4.0 by the author.