Using SnapRAID safely

First written 2026-01-29; Last updated 2026-01-29

Note
TL;DR: Deleting files may prevent the recovery of other data until the next snapraid sync. Either move deleted files outside the SnapRAID paths until the sync, or run SnapRAID on a snapshot and keep the last snapshot until after the next sync.

SnapRAID is an alternative approach to traditional RAID. It provides redundant data storage using parity information to guard against disk failure and silent corruption. Unlike traditional RAID, it works on regular file systems instead of disks/block devices. This makes it quite flexible regarding disk sizes, parity count, etc., and it keeps the files accessible even without SnapRAID. Also unlike RAID, the parity is not calculated live but only when running a manual snapraid sync.

As the parity is calculated over all files, deleting a file effectively destroys some of the parity information. This can prevent the recovery of all other files that depend on the same parity blocks.
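
To make this concrete, consider the single-parity case, which roughly behaves like RAID5's block-wise XOR (a simplified sketch; the block index i is illustrative):

parity[i] = disk1[i] XOR disk2[i]
disk1[i]  = parity[i] XOR disk2[i]   # recovering a lost block of disk1

Deleting a file changes the blocks it occupied, say on disk2, while parity[i] still reflects the old content. If the corresponding blocks on disk1 are also lost (a failed disk, or a second deleted file as shown below), recovery has to solve one equation with two unknowns and fails.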

Reproducing the problem

First, initialize SnapRAID and some test data:

cd /tmp

# Initialize disks
rm -f disk0; truncate -s 1G disk0; mkfs.ext4 disk0
rm -f disk1; truncate -s 1G disk1; mkfs.ext4 disk1
rm -f disk2; truncate -s 1G disk2; mkfs.ext4 disk2
rm -rf mnt0 mnt1 mnt2; mkdir -p mnt0 mnt1 mnt2
mount disk0 mnt0; mount disk1 mnt1; mount disk2 mnt2

# Setup SnapRAID
cat > /etc/snapraid.conf <<EOF
parity /tmp/mnt0/parity
data mnt1 /tmp/mnt1/data
data mnt2 /tmp/mnt2/data
content /tmp/mnt1/content
content /tmp/mnt2/content
EOF
mkdir -p mnt1/data mnt2/data

# Write data
dd if=/dev/urandom bs=1M count=1 of=mnt1/data/file1
dd if=/dev/urandom bs=1M count=1 of=mnt1/data/file2
dd if=/dev/urandom bs=1M count=1 of=mnt2/data/file3
dd if=/dev/urandom bs=1M count=1 of=mnt2/data/file4

snapraid sync

This creates the following directory structure:

/tmp
├── mnt0
│   └── parity
├── mnt1
│   ├── content
│   ├── content.lock
│   └── data
│       ├── file1
│       └── file2
└── mnt2
    ├── content
    └── data
        ├── file3
        └── file4

Now, simulate a failure of one disk by removing file1 and file2. Running snapraid fix restores them without issues, as expected. Removing file1 and file4 also permits restoring them because they do not share the same parity blocks. However, removing file1 and file3 prevents the restore:

[...]
unrecoverable file1
unrecoverable file3
100% completed, 3 MB accessed in 0:00

       8 errors
       0 recovered errors
       4 UNRECOVERABLE errors
DANGER! Unrecoverable errors detected!
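
For reference, the failing case above corresponds to commands along these lines (a sketch based on the setup above):

cd /tmp
# file1 and file3 live on different disks but occupy the same parity blocks
rm mnt1/data/file1 mnt2/data/file3
snapraid fix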

This means that removing any file is potentially unsafe as long as other errors (like a broken disk) can occur; the exact effect depends on the configuration and the number of parities. This behavior can be especially unexpected with mergerfs. A typical setup is to configure mergerfs over multiple directories on different drives and to also protect those directories with SnapRAID. However, the default allocation strategy ("percentage free random distribution") can lead to a situation where deleting a single directory removes files on multiple disks at once, causing the problem described above.

This has also been reported elsewhere, for example in https://www.reddit.com/r/Snapraid/comments/1clk039/snapraid_not_safe/ where the user was only able to restore 70% of their files after accidentally removing a directory. That thread also points to the explanation at https://sourceforge.net/p/snapraid/discussion/1677233/thread/8282fcf886/?limit=25#b71e/a288/884a/1d58 (same results as this article).

Workarounds

The issue is basically a "race condition" between deleting files and running snapraid sync. The workarounds below make these two steps effectively "atomic".

Delayed deletion

One possible workaround is to move files out of the SnapRAID paths instead of deleting them. In case of an error, these old files can be passed to snapraid fix using the --import option to permit recovery. One caveat is that --import can only be given once, so you must move/copy all files together (or use something like mergerfs). After the next snapraid sync the files can be deleted for good.
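
A minimal sketch of this workflow, assuming a staging directory /tmp/graveyard that lies outside all SnapRAID data and parity paths:

# Instead of deleting, move files to a staging area outside the SnapRAID paths
mkdir -p /tmp/graveyard
mv /tmp/mnt1/data/file1 /tmp/graveyard/

# If a disk fails before the next sync, let the fix read the staged copies
snapraid fix --import /tmp/graveyard

# After the next successful sync the staged files can be deleted for good
snapraid sync && rm -rf /tmp/graveyard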

Snapshots

When that is not feasible, file system snapshots are another approach. The snapshots must be kept until after the next snapraid sync; they are only required for the data, not for the parity files. As an example, Btrfs snapshots are used in the following setup:

cd /tmp

# Initialize disks (btrfs this time, except for parity, which needs no snapshots)
rm -f disk0; truncate -s 1G disk0; mkfs.ext4 disk0
rm -f disk1; truncate -s 1G disk1; mkfs.btrfs disk1
rm -f disk2; truncate -s 1G disk2; mkfs.btrfs disk2
rm -rf mnt0 mnt1 mnt2; mkdir -p mnt0 mnt1 mnt2
mount disk0 mnt0; mount disk1 mnt1; mount disk2 mnt2

# Setup SnapRAID (new paths!)
cat > /etc/snapraid.conf <<EOF
parity /tmp/mnt0/parity
data mnt1 /tmp/mnt1/snap
data mnt2 /tmp/mnt2/snap
content /tmp/mnt1/content
content /tmp/mnt2/content
EOF

# btrfs can only create snapshots for subvolumes
btrfs subvolume create /tmp/mnt1/data
btrfs subvolume create /tmp/mnt2/data

# Write data
dd if=/dev/urandom bs=1M count=1 of=mnt1/data/file1
dd if=/dev/urandom bs=1M count=1 of=mnt1/data/file2
dd if=/dev/urandom bs=1M count=1 of=mnt2/data/file3
dd if=/dev/urandom bs=1M count=1 of=mnt2/data/file4

Note that SnapRAID now runs on the snapshot directories snap, not on the live data! All regular changes must be performed only in the data directories, never in the snapshots.

With this setup, the following script can be used to run SnapRAID safely. Before the first run, the snapshots need to be created manually with for i in 1 2; do btrfs subvolume snapshot /tmp/mnt$i/data /tmp/mnt$i/snap; done.

set -eu

for i in 1 2; do
    x="/tmp/mnt$i"
    # Make sure we have a mounted file system here as safeguard
    if ! findmnt "$x" > /dev/null; then
        echo "$x is not mounted" >&2
        exit 1
    fi

    cur="$x/snap"
    old="$x/snap.old"
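    # A leftover snap.old indicates that a previous run did not complete; clean it up manually before retrying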
    if test -e "$old"; then
        echo "$old already exists!" >&2
        exit 1
    fi
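    # Rotate: keep the previous snapshot around as snap.old until the sync has succeeded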
    mv "$cur" "$old"
    btrfs subvolume snapshot "$x/data" "$cur"
done

snapraid sync

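# The sync succeeded, so the previous snapshots are no longer needed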
for i in 1 2; do
    btrfs subvolume delete "/tmp/mnt$i/snap.old"
done

Running the setup and sync scripts yields the following directory structure:

/tmp
├── mnt0
│   └── parity
├── mnt1
│   ├── content
│   ├── content.lock
│   ├── data
│   │   ├── file1
│   │   └── file2
│   └── snap
│       ├── file1
│       └── file2
└── mnt2
    ├── content
    ├── data
    │   ├── file3
    │   └── file4
    └── snap
        ├── file3
        └── file4

Due to the snapshots, deleting files no longer invalidates the parity. Losing a disk after files were manually deleted therefore no longer causes additional data loss.

Currently, snapraid sync shows the following warning when running on a Btrfs snapshot:

WARNING! UUID is unsupported for disks: 'mnt1', 'mnt2'. Not using inodes to detect move operations.

This is only a warning and does not affect the ability to recover data. However, renames are handled less efficiently.

LVM snapshots are an alternative to Btrfs.
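
A rough sketch of the LVM variant for a single data disk; the volume group, logical volume, and mount point names (vg1, data, data-snap, /srv/snap1) as well as the snapshot size are placeholders, and the SnapRAID data path would have to point at the snapshot's mount point:

# Snapshot the live data and mount it where SnapRAID expects it
lvcreate --snapshot --size 10G --name data-snap vg1/data
mount /dev/vg1/data-snap /srv/snap1

# ... snapraid sync ...

# After a successful sync the snapshot can be dropped again
umount /srv/snap1
lvremove -y vg1/data-snap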

Results

The described setup (snapshots using Btrfs) was tested on real-world data (40 TiB across roughly 10^6 files), including a (simulated) loss of a drive, among other things. It works without issues so far.

Note that SnapRAID does not restore permissions, owners, or groups; these must be tracked separately if necessary. SnapRAID also does not preserve sparse files when restoring.
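
For the permissions, one option is to dump them alongside the data with getfacl and re-apply them after a restore (a sketch; the dump location is arbitrary):

# Record owners, groups, and permissions (run before/with each sync)
getfacl -R -p /tmp/mnt1/data > /tmp/mnt1/acl-dump

# Re-apply them after a restore (as root, so ownership is restored as well)
setfacl --restore=/tmp/mnt1/acl-dump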

To present the SnapRAID directories as a single directory, mergerfs can be used with the following configuration in /etc/fstab:

/tmp/mnt1/data:/tmp/mnt2/data /tmp/mnt mergerfs category.create=pfrd,func.getattr=newest,passthrough.io=rw,cache.files=auto-full,moveonenospc=false,fsname=mergerfs-mnt 0 0

Also see the mergerfs documentation for details:

  • category.*, func.*: important defaults, set explicitly

  • passthrough.io: better IO performance (needs root)

  • cache.files, moveonenospc: prerequisites for passthrough

  • fsname: shorter name in mount/findmnt/etc.
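
After adding the line, the pool can be brought up and inspected along these lines (a sketch):

mkdir -p /tmp/mnt
mount /tmp/mnt      # picks up the new fstab entry
findmnt /tmp/mnt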

Resulting in:

/tmp/mnt
├── file1
├── file2
├── file3
└── file4

