Zen Bastard (jimbojones) wrote,

ZFS data healing under FreeBSD 7.2-RELEASE

I decided to do a quick test of ZFS's automatic data healing and corruption protection today.

Using the same 5-drive RAIDZ setup under FreeBSD 7.2-RELEASE as in the earlier benchmarks, I first created a text file containing the sentence "This is a test of data corruption in live filesystems." I saved that file to the root of the RAIDZ, unmounted the pool, and then kldunloaded the ZFS module from FreeBSD's kernel.

Next, I used a small Perl script to look through /dev/ad8 - one of the five physical drives in the array - to find the sentence above. Having found it, I then did a raw write to /dev/ad8 changing "This" to "Tgjs". Presto, data corruption!
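For anyone who wants to try the same find-and-overwrite trick, here is a rough shell sketch of the idea. It is not the Perl script I used; the function name and arguments are made up for illustration, and it assumes a grep that reports the byte offset of each match when given -b and -o together (GNU grep does). It is written against a disk image or other regular file. A raw FreeBSD disk device generally wants sector-sized, sector-aligned I/O, so on /dev/ad8 you would most likely have to read the whole sector, patch it, and write it back instead of doing a 4-byte write.

```shell
#!/bin/sh
# Sketch only: find the first occurrence of a string in a file (for
# example a disk image) and overwrite the start of it in place.
# corrupt_first_match and its arguments are hypothetical names.

corrupt_first_match() {
    target=$1        # file or disk image to modify
    needle=$2        # text to search for
    replacement=$3   # bytes written over the start of the match

    # -a: treat binary data as text, -b: print byte offsets,
    # -o: print only the match, -F: fixed string rather than a regex.
    offset=$(grep -a -b -o -F -- "$needle" "$target" | head -n 1 | cut -d: -f1)

    if [ -z "$offset" ]; then
        echo "needle not found in $target" >&2
        return 1
    fi

    # bs=1 makes seek count in bytes; conv=notrunc leaves the rest
    # of the target untouched.
    printf '%s' "$replacement" |
        dd of="$target" bs=1 seek="$offset" conv=notrunc 2>/dev/null

    # Report where the write happened.
    echo "$offset"
}

# Example (on a copy, never on a disk you care about):
#   corrupt_first_match disk.img "This is a test of data corruption" "Tgjs"
```

Scanning a whole drive this way is slow, and the pool must be offline while you do it, which is why I unloaded the ZFS module first.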

Now, I re-kldloaded zfs.ko, then re-mounted the pool, and did a quick cat test.txt:

# cat /backup/test.txt
This is a test of data corruption in live filesystems.

Excellent - ZFS did in fact catch and heal the corruption I introduced - and if I check the status of the pool, it will warn me about it:

# zpool status backup
  pool: backup
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested

        NAME         STATE      READ WRITE CKSUM
        backup      ONLINE        0     0     0
          raidz1    ONLINE        0     0     0
            ad6     ONLINE        0     0     0
            ad8     ONLINE        0     0     1
            ad10    ONLINE        0     0     0
            ad12    ONLINE        0     0     0
            ad14    ONLINE        0     0     0

errors: No known data errors

That is pretty freaking awesome.

One nasty caveat: due to an inadvisable configuration line in /etc/devd.conf, ZFS errors are not recorded in any system log by default. Unless you have specifically changed the logging configuration on your machine, they go to user.warn, which (again by default) effectively means /dev/null. See here for more info on the logging SNAFU.
Tags: alpha geek
