FISH
I do a good bit of hardware integration: with the web, with manufacturing equipment, with embedded systems, and with big data sets or systems that can sustain multiple failures. Not necessarily all at once, but typically, people expect FISH from me :)
FISH is Fully Integrated Software and Hardware (btw, as a side note, the internal project at Sun to create appliances based on ZFS was known as FISHWorks). The Raspberry Pi is a cool piece of hardware, but I typically need stuff that is only (or mostly) found on Solaris and derived OSes, such as ZFS. I've been using ZFS for many years now, since the first public release on Solaris Nevada. ZFS scales and gives you data integrity. And it can run on the largest systems known to man.
It scales
For example, I'm listening right now to ZFS Day's live video stream and hearing a talk about ZFS on the Sequoia supercomputer, which is the fastest supercomputer out there. They are using it as a native port, not using FUSE.
What is ZFS?
Wikipedia: "ZFS is a combined file system and logical volume manager designed by Sun Microsystems. The features of ZFS include data integrity verification against data corruption modes, support for high storage capacities, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs. ZFS is implemented as open-source software, licensed under the Common Development and Distribution License (CDDL)."
From Supercomputers to $35 computers
So, ZFS obviously scales at the highest level. Well, it also scales down: I've been using ZFS a bit on the Raspberry Pi through FUSE, until I can get a Solaris-derived OS (such as illumos, SmartOS, OpenIndiana, OpenSolaris, etc.) on the Raspberry Pi. That way, at least I have ZFS. I'm still missing zones, SMF and DTrace, but it is a start.
Now just a reminder, the Pi only has 256MB of RAM in total, and a Broadcom (BCM) ARM processor. So first things first, we need to give as much RAM to the OS as possible, and reduce the video buffer size.
I'm using a 240MB split on that Raspberry Pi since it is running only in text mode at the console, and I remote to it using ssh -X.
If you use the composite out you might want to use the 224MB split, and definitely 192MB or 128MB when using HDMI, but at that point you are choking ZFS. That's 128MB for the OS, ZFS and whatever apps you are running...
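To make the trade-off concrete, here is a small sketch that prints how the 256MB is divided for each split mentioned above. The helper name gpu_share is made up for illustration; the numbers are simply 256 minus the ARM share.

```shell
# Total RAM on this Pi, in MB (from the text above).
TOTAL_MB=256

gpu_share() {
    # $1 = MB given to the ARM side; prints the MB left for the GPU
    echo $((TOTAL_MB - $1))
}

# The four splits discussed above.
for arm in 240 224 192 128; do
    echo "ARM ${arm}MB / GPU $(gpu_share "$arm")MB"
done
```

With the 240MB split the GPU keeps only 16MB, which is fine for a text console; at 128MB the GPU takes half of the machine.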
Fully loaded
Although Raspbian comes with a good amount of stuff preloaded, it was not intended to be used with FUSE out of the box, and ZFS was probably never on anybody's radar. So let's start by adding the FUSE stuff and the libraries and tools we will need to build ZFS. This is the shortlist:
fdion@raspberrypi ~/zfs $ sudo apt-get install fuse-utils libfuse-dev libfuse2
fdion@raspberrypi ~/zfs $ sudo apt-get install libaio-dev libattr1-dev attr
fdion@raspberrypi ~/zfs $ sudo apt-get install git scons
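Before starting the build, it may be worth confirming that the command-line tools ended up on the PATH. This is a minimal sketch: missing_tools is a hypothetical helper, and it only checks executables such as git and scons, not the -dev library packages.

```shell
missing_tools() {
    # Print the name of every argument that is not an executable on the PATH.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two build tools installed above; no output means both were found.
missing_tools git scons
```

If this prints a name, re-run the matching apt-get install line before going further.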
If you build it...
So we have the prerequisites. Let's get the code, compile it and install the tools:
fdion@raspberrypi ~ $ mkdir zfs
fdion@raspberrypi ~ $ cd zfs
fdion@raspberrypi ~/zfs $ git clone https://bitbucket.org/cli/zfs-fuse-arm.git
fdion@raspberrypi ~/zfs $ cd zfs-fuse-arm/
fdion@raspberrypi ~/zfs/zfs-fuse-arm $ cd src
fdion@raspberrypi ~/zfs/zfs-fuse-arm/src $ scons
[a lot of stuff will scroll by]
fdion@raspberrypi ~/zfs/zfs-fuse-arm/src $ sudo scons install
[again, more stuff will scroll by]
Wow, it compiled (scons). And installed (sudo scons install). It's a good thing we are using the zfs-fuse-arm version, because the mainline won't get very far in the compile.
A demonstration, if you please?
Well of course! Let's start the zfs-fuse daemon and create two virtual disks. I'm creating two 100MB disks here using dd (this is on a slow SD card, rated 10MB/s). You could also use actual devices under /dev (like a pair of USB keys):
fdion@raspberrypi ~/zfs/zfs-fuse-arm/src/zfs-fuse $ sudo sh run.sh &
fdion@raspberrypi ~/zfs/zfs-fuse-arm/src/zfs-fuse $ cd
fdion@raspberrypi ~ $ cd zfs
fdion@raspberrypi ~/zfs $ mkdir test
fdion@raspberrypi ~/zfs $ cd test
fdion@raspberrypi ~/zfs/test $ dd if=/dev/zero of=fakedisk1 bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 10.2747 s, 10.2 MB/s
fdion@raspberrypi ~/zfs/test $ dd if=/dev/zero of=fakedisk2 bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 10.7517 s, 9.8 MB/s
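Writing 100MB of zeros takes about ten seconds per file on this card. As an aside, a sparse file of the same size can be created almost instantly by seeking past the end instead of writing the data. This is a sketch and not what was run above: make_fake_disk is a hypothetical helper, and whether zfs-fuse behaves identically on sparse backing files is an assumption to verify.

```shell
make_fake_disk() {
    # $1 = path of the backing file, $2 = size in MB
    # count=0 writes nothing; seek=$2 makes dd extend the file to $2 blocks of 1024k.
    dd if=/dev/zero of="$1" bs=1024k count=0 seek="$2" 2>/dev/null
}

# Create two 100MB backing files in a scratch directory (left in place for later use).
workdir=$(mktemp -d)
make_fake_disk "$workdir/fakedisk1" 100
make_fake_disk "$workdir/fakedisk2" 100
ls -l "$workdir"
```

The files report the full 104857600 bytes but occupy almost no space on the card until something writes to them.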
Up to now we haven't done anything with ZFS per se. To mirror two drives in ZFS and create a new storage pool out of them, all we have to do is:
fdion@raspberrypi ~/zfs/test $ sudo zpool create mymirror mirror /home/fdion/zfs/test/fakedisk1 /home/fdion/zfs/test/fakedisk2
Now let's create a filesystem on that new zpool device, and mount it to a local folder in my home directory, change permissions so I can write to it and finally copy some files from /etc to my new filesystem:
fdion@raspberrypi ~/zfs/test $ cd
fdion@raspberrypi ~ $ mkdir myfilesystem
fdion@raspberrypi ~ $ sudo zfs create mymirror/myfilesystem -o mountpoint=/home/fdion/myfilesystem
fdion@raspberrypi ~ $ sudo chown fdion:pi myfilesystem/
fdion@raspberrypi ~ $ cd myfilesystem
fdion@raspberrypi ~/myfilesystem $ cp /etc/*.conf .
cp: cannot open `/etc/fuse.conf' for reading: Permission denied
fdion@raspberrypi ~/myfilesystem $ ls
adduser.conf gssapi_mech.conf libaudit.conf pnm2ppa.conf
asound.conf hdparm.conf logrotate.conf resolv.conf
ca-certificates.conf host.conf mke2fs.conf rsyslog.conf
colord.conf idmapd.conf mtools.conf sensors3.conf
debconf.conf insserv.conf nsswitch.conf sysctl.conf
deluser.conf ld.so.conf ntp.conf ts.conf
gai.conf libao.conf pam.conf ucf.conf
fdion@raspberrypi ~/myfilesystem $ sudo zfs list
NAME USED AVAIL REFER MOUNTPOINT
mymirror 191K 63.3M 22K /mymirror
mymirror/myfilesystem 89.5K 63.3M 89.5K /home/fdion/myfilesystem
fdion@raspberrypi ~/myfilesystem $ sudo zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
mymirror 95.5M 196K 95.3M 0% 1.00x ONLINE -
fdion@raspberrypi ~/myfilesystem $
How cool is that? I now have a mirrored backup of my .conf files. Well, not quite. We are using fake disks on the same SD card, so if the card dies I lose everything.
So next time we'll demo with actual USB drives.
18 comments:
Mirroring works fine, and if a device fails, it is handled properly; however, zpool status doesn't reflect reality. I hadn't tested that part yet, so I'll have to dig into the code.
No failure:
pi@raspberrypi ~ $ sudo zpool status -v
pool: mymirror
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mymirror ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdb1 ONLINE 0 0 0
errors: No known data errors
I then pulled the second usb device:
pi@raspberrypi ~ $ ls /dev/sd*
/dev/sda /dev/sda1
sdb and sdb1 are gone, but:
pi@raspberrypi ~ $ sudo zpool status -v
pool: mymirror
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mymirror ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdb1 ONLINE 0 0 0
errors: No known data errors
pi@raspberrypi ~ $ sudo zfs list
NAME USED AVAIL REFER MOUNTPOINT
mymirror 55.4M 7.27G 55.3M /mymirror
Before doing this I was accessing a file in a loop, and it is still looping. So the ZFS side of things is working, just not the notification to the status.
What's the chance that the file data was in the ARC cache, and hence ZFS didn't notice the disk vanish? Try a write; that'll force IO to the disk, which should cause the failure notification.
Writing didn't do it, but something else did:
fdion@raspberrypi ~ $ sudo zpool status
pool: mymirror
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mymirror ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20054054620F3CC11EC2-0:0-part1 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20052845410F3CC16219-0:0-part1 ONLINE 0 0 0
errors: No known data errors
I then pulled the plug on one device (the reason they are not sda1 and sdb1 is that I shut down the Pi to bring it to CHS last night, and I had to do a zpool export mymirror and zpool import mymirror for it to mount again):
fdion@raspberrypi ~ $ sudo zpool status
pool: mymirror
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mymirror ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20054054620F3CC11EC2-0:0-part1 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20052845410F3CC16219-0:0-part1 ONLINE 0 0 0
errors: No known data errors
Still showing online, although clearly it is not, since I pulled it. Let's try some writes:
fdion@raspberrypi ~ $ sudo cp -r hardware_projects /mymirror/
fdion@raspberrypi ~ $ sudo zpool status
pool: mymirror
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mymirror ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20054054620F3CC11EC2-0:0-part1 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20052845410F3CC16219-0:0-part1 ONLINE 0 0 0
errors: No known data errors
Alright, we've got to get this to trigger. Let's do a scrub.
fdion@raspberrypi ~ $ sudo zpool scrub mymirror
fdion@raspberrypi ~ $ sudo zpool status
pool: mymirror
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: scrub completed after 0h0m with 0 errors on Thu Oct 4 10:45:16 2012
config:
NAME STATE READ WRITE CKSUM
mymirror DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20054054620F3CC11EC2-0:0-part1 ONLINE 0 0 0
disk/by-id/usb-SanDisk_Cruzer_Edge_20052845410F3CC16219-0:0-part1 UNAVAIL 0 76 0 cannot open
errors: No known data errors
So that works, but normally on native ZFS the failure shows up right away, without a scrub. I don't want to have to schedule scrubs every 5 minutes :)
I'm thinking there might be an unimplemented message. I'll have to look at the code when I have a few minutes.
It takes a while for zfs to give up on IO to devices - it can be 5 minutes or more before zfs offlines the device. If you generate some IO to the device then wait it should eventually notice the device is gone. Shortening device timeouts will help here.
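For anyone who wants to watch for the state change from a script, here is a minimal sketch that pulls the state out of zpool status output. The function names pool_state and check_pool are hypothetical, and the helper can only report what zpool status already says; on the zfs-fuse build tested above, that only changed after a scrub.

```shell
pool_state() {
    # Print the value of the first "state:" line read from stdin.
    awk '$1 == "state:" { print $2; exit }'
}

check_pool() {
    # $1 = pool name; needs the same privileges as zpool status.
    state=$(zpool status "$1" | pool_state)
    [ "$state" = "ONLINE" ] && return 0
    echo "pool $1 is ${state:-unknown}" >&2
    return 1
}

# Demo on a few lines taken from the degraded output above; prints DEGRADED.
pool_state <<'EOF'
  pool: mymirror
 state: DEGRADED
 scrub: scrub completed after 0h0m with 0 errors on Thu Oct 4 10:45:16 2012
EOF
```

A cron entry could call check_pool mymirror and mail the result, which is cheaper than scrubbing on a timer but still depends on the daemon noticing the missing device.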
I am far from an expert in this field and I had some issues, but I managed to resolve them. Building from scratch, I needed some additional steps:
install Raspbian
http://www.raspbian.org/RaspbianInstaller
install gcc:
sudo apt-get install git gcc build-essential libsdl1.2-dev
install openssl library and headers:
sudo apt-get install libssl-dev
then follow this post.
Thanks for the great info!
Have you tried running FreeBSD on your Pi? FreeBSD has native ZFS support, so it may be easier than having to install FUSE and ZFS separately.
Yes, I do have a FreeBSD Pi too. My long-term goal though is to run illumos on the Pi.
See: http://solarisdesktop.blogspot.com/2013/02/illumos-on-raspberrypi.html
I was hoping someone had done this when i found this blog... def gonna check that out!
When running scons, is this something to worry about ?
"scons: warning: BuildDir() and the build_dir keyword have been deprecated;
use VariantDir() and the variant_dir keyword instead.
File "/root/zfs/zfs-fuse-arm/src/lib/libzpool/SConscript", line 4, in "
Also saw this;
"lib/libzpool/vdev.c: In function 'vdev_open_children':
lib/libzpool/vdev.c:1085:3: warning: comparison between pointer and integer [enabled by default]"
zfs-fuse/zfs_ioctl.c: In function 'zfs_ioc_set_prop':
zfs-fuse/zfs_ioctl.c:2292:24: warning: comparison between pointer and integer [enabled by default]
Hi, I believe we have native ZFS support in Linux now. Did you try it?
I just saw this blog because I was googling Raspberry Pi and ZFS.
I'm going to give native ZFS a try on a Raspberry Pi this weekend using Gentoo.
My laptop already runs root on ZFS with Gentoo, and the ability to jump back to snapshots instantly is a godsend when an emerge goes bad.
I just got my first R-Pi today and can't wait to try it out. I will use my laptop to cross-compile over distcc though, because I think I could wait a long time for the R-Pi to compile just the kernel, let alone everything else.
Hi,
Congratulations on this tutorial.
Is it possible to create only a filesystem in ZFS, without a mirror?
Thanks
Paragraphs 1 & 2
Fish, it scales.
well played sir, well played :D
By the way: ZoL compiles for Raspbian just fine. It'll most likely perform better than zfs-fuse thanks to its kernel module, which is pretty important in a pi.