Any plans on adding ZFS to the installer?
Others more qualified to answer may be along, but I would doubt it myself. There are a LOT of options when setting it up, and setting a ‘standard’ for an Arch-based system is a bit of a stretch! It is easy enough to do if you know how it should work: a couple of commands to load a module, pre-partition, and fire the installer at the result…
Oh - and I would recommend rEFInd to boot with, rather than grub.
Calamares does not include it. If the option is added to the installer, we can provide it.
Doing this downstream would need a lot of coding and changes to the partition module in Calamares, and we do not have anyone in development with the knowledge and time to do so.
https://github.com/pop-os/installer not calamares
And is based on Debian… very different
Ubuntu got OpenZFS, but that's another story. As long as the module isn't available on all Linux systems, Calamares will not support it out of the box.
From a look through the Archwiki, it appears to be possible by presetting what you want before actually running Calamares to do the install, but I haven’t tried it! YMMV.
I wonder how hard it would be to switch Manjaro back to purer Arch after an install? Have fun, however you decide to go…
But Calamares has no issue with proprietary Nvidia modules or other closed source firmware packages? Looks like the FUD campaign from some kernel developers is paying off. Congratulations!
Currently, on Linux, btrfs gets more and better testing than zfs. In the end you're better off with btrfs.
KPMcore, the KDE Partition Manager core, is a library for examining and modifying partitions, disk devices, and filesystems on a Linux system. It provides a unified programming interface over top of (external) system-manipulation tools.
Always check the dates on such articles:
by Rudd-O — last modified 6 years ago
Development on both filesystems has progressed since that time.
Unless you want any of the numerous features present in zfs but not present in btrfs or you need more file system stability than what btrfs can offer.
I could only identify one item in the article which is not valid anymore: In the meantime btrfs supports send/receive.
Other than that I believe the described differences still exist.
And what I find remarkable is the missing RAID5 stability. The article is 6 years old and the RAID5 in btrfs is still not fully reliable. This is honestly disappointing.
Real benchmarks between RAID10 on zfs and others (xfs or ext4) are missing.
In most use cases, filesystems like zfs/btrfs are going to have lower raw read/write performance than simple filesystems like xfs/ext4. The reason for using those filesystems isn’t performance, it is access to all the advanced functionality they offer.
If it would only be about performance…
zfs and btrfs checksum all data and metadata, which is a strong feature with regard to data integrity.
And btrfs/zfs are the only filesystems which include RAID and LVM functionality, although btrfs falls a little short in this aspect.
So if you want to do a performance benchmark with XFS or ext4 you need to benchmark mdadm+lvm+xfs/ext4 against zfs/btrfs. And although I have not seen such a benchmark yet I would bet some money that I know the winner.
I am doing frequent fio benchmarks with my zfs filesystems. I have 64 GB of RAM and I have an internal RAID10 consisting of 4x Western Digital WD40 EAFX. They are not really fast. The fio benchmark with one job of size=64 GB gives read/write speeds around 260-300 MB/s, which is pretty good. This is pretty close to the physical limit, I assume.
But when I test for 600 parallel jobs with 100 MB each I see the true benefit of zfs. This gives a total of 300 MB/s write speed and 600 MB/s read speed. The 600 MB/s read speed is due to caching effects because the file size is just 100 MB. And this is a core competence of zfs. The caching algorithm is best in class.
I also have a RAIDZ2 in an external USB case. It consists of 6x Western Digital WD20EARX.
With one fio job of size 64GB I get read/write speeds of 220-280 MB/s via USB3. This is hard to top.
I am not claiming that zfs is the fastest RAID filesystem. But my results show that it is certainly very competitive. In other words: bad performance is nothing to be afraid of.
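For anyone who wants to run comparable tests, the two fio runs described above could look roughly like this. It is only a sketch: the job name, the 1M block size, the /tank/bench directory and --group_reporting are assumptions, not the settings behind the numbers quoted above. Only the job counts and sizes come from the post. The helper prints the command lines instead of running them, so nothing is written to disk.

```shell
#!/bin/sh
# Print an fio command line; nothing is executed here.
# Arguments: rw mode (read|write), number of jobs, size per job, target directory.
fio_cmd() {
    printf 'fio --name=bench --directory=%s --rw=%s --bs=1M --numjobs=%s --size=%s --group_reporting\n' \
        "$4" "$1" "$2" "$3"
}

# One large job, as in the 64 GB test (the directory is a placeholder):
fio_cmd write 1 64G /tank/bench

# Many small parallel jobs, as in the 600 x 100 MB test:
fio_cmd read 600 100M /tank/bench

# Total amount of data touched by the parallel test, in MB:
echo "$((600 * 100)) MB"
```

The last line works out to 60000 MB in total, spread over files of only 100 MB each, which fits the caching explanation given above.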
And on top of that you get
- checksums - if you love your data you want to have that.
- native encryption (with a very small performance impact)
- advanced volume management
- robustness. A lot has been said about the stability and robustness of zfs. It is legend.
I also did stability tests for btrfs/zfs in VirtualBox VMs. I used the dd command from the host to destroy blocks in the guest zfs/btrfs devices. zfs recovered from each of my tests. btrfs did not.
- plenty of tuning possibilities to adjust the performance for a specific purpose, e.g. datasets for PostgreSQL or Oracle databases can be configured with recordsize=4k and other smart config parameters, which helps zfs to outperform any other fs.
I know this sounds like a sales pitch. But I have used zfs for many years now and I am convinced. With my new NVMe device I gave btrfs a chance for several weeks, but was not convinced.
Ok, enough, I am sold
There are a few things you listed up there which really caught my interest.
Specific datasets for Postgres and the ability to add SSDs as transparent cache (read that somewhere else) could give my 7 year old server quite a boost I think.
I planned to migrate the server to encrypted btrfs at some point anyway, so maybe I switch to zfs instead.
As usual, the Arch wiki is very detailed, anything else you think I should read up on beforehand, especially for someone currently running btrfs?
Just a couple of thoughts:
Don't forget to use the ashift=12 option when you create the pool. This parameter cannot be changed afterwards. Nowadays zfs is supposed to set a correct ashift value automatically. But you never know. There are still devices out there which report wrong block sizes and can confuse zfs.
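Where the 12 comes from: ashift is the base-2 logarithm of the sector size, so 4096-byte sectors give 12 and 512-byte sectors give 9. A small sketch of that arithmetic follows; the helper name ashift_for is made up for illustration, and the sysfs path in the comment uses sda purely as a placeholder.

```shell
#!/bin/sh
# Compute ashift (log2 of the sector size in bytes) for a power-of-two size.
ashift_for() {
    size=$1
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}

# What the disk itself reports can be read from sysfs, for example:
#   cat /sys/block/sda/queue/physical_block_size
# As noted above, some devices report a wrong value here.

ashift_for 512
ashift_for 4096
```

After creating the pool, `zpool get ashift <pool>` shows the value that was actually used.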
Adding an SSD cache does not necessarily help performance, especially when the data volume is small. If you only write a few gigabytes at a time zfs will handle this sufficiently fast in memory. Besides that, if you use a redundant pool with any RAID level you want to have a redundant SSD cache as well. Otherwise you just add a single point of failure. So instead of investing money in an SSD cache I would invest money in more RAM.
Give zfs the full devices and not just partitions. zfs has its own I/O scheduler and that will only work if it owns the full device. If you just give it a partition, the kernel's I/O scheduler will be used because of the other partitions on the device. That is not optimal. Therefore, set the Linux I/O scheduler to none for the device and give zfs the full device.
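Checking and changing the scheduler goes through sysfs. The sketch below is an illustration, not a tested recipe for any particular system: both function names are invented, the sample line is hard-coded rather than read from a real disk, and the setter is defined but deliberately not called.

```shell
#!/bin/sh
# The active scheduler is the entry in square brackets in
# /sys/block/<disk>/queue/scheduler, e.g. "mq-deadline kyber [bfq] none".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Switch one whole disk (e.g. sda, not a partition) to "none". Needs root.
# Defined here for reference only; it is not called by this script.
set_scheduler_none() {
    echo none > "/sys/block/$1/queue/scheduler"
}

# Sample line, not read from a real system:
active_scheduler "mq-deadline kyber [bfq] none"
```

On a real machine the input would be `"$(cat /sys/block/<disk>/queue/scheduler)"`. A value written this way does not survive a reboot.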
Don't store data in the topmost parent dataset but use child datasets instead. zfs datasets come at no cost. You can have as many as you want. Example:
zpool create tank -o ashift=12 mirror /dev/disk/by-id/id-one /dev/disk/by-id/id-two
This will mount the new pool in /tank. This can be changed later by setting the mountpoint option. (Another nice feature of zfs: there is no need for /etc/fstab. zfs handles the mountpoints.)
Don't store data in /tank but create extra datasets:
zfs create tank/pictures
zfs create tank/pictures/jpg
zfs create tank/home
zfs create tank/home/myuser
etc.
For each of these new datasets you can set all properties independently: encryption, recordsize, etc.
But first you should set some default properties for /tank because the properties are inherited by all new children. This saves you some work. I use the following defaults:
recordsize=1M compression=lz4 atime=off xattr=sa acltype=posixacl
The recordsize parameter has a performance impact. For scenarios with lots of sequential read/write activity the high value of 1M gives the best performance. But for more random read/write scenarios the default value of 128k is better because it gives lower latency. Therefore I use 1M for all my data stores and 128k for my home directory. For datasets which host databases you might want to go down to 4k.
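Spelled out as commands, the defaults above plus the 128k override for the home dataset would look something like the following. This is a sketch that only prints the `zfs set` lines so they can be reviewed first; the function name is invented, and the pool name is passed in as an argument (tank here, as in the example above).

```shell
#!/bin/sh
# Print the zfs commands that set the default properties on the top dataset,
# then override recordsize on the home dataset. Nothing is executed.
print_defaults() {
    pool=$1
    for prop in recordsize=1M compression=lz4 atime=off xattr=sa acltype=posixacl; do
        echo "zfs set $prop $pool"
    done
    # Children inherit the defaults; only the differing value is set explicitly.
    echo "zfs set recordsize=128k $pool/home"
}

print_defaults tank
```

Once applied, `zfs get -r recordsize tank` shows for each dataset whether the value is local or inherited. A changed recordsize only affects data written afterwards.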
And here are some educational links:
Aaron Toponce zfs pages:
Bachelor thesis about zfs, in German:
The Oracle ZFS handbook is still a good source although OpenZFS is independent of Oracle:
That's it for now.
Cool! Thanks a lot. Lots of food for thought over Christmas
- Set as few options as possible on the pool itself; instead, set your options on the datasets. If you set an option on a pool and want to change it later, in many cases you will have to destroy the whole pool and start over. It is much easier to migrate data between datasets inside a pool.
- tank is the name commonly used in examples. Don’t actually call your zpool tank.
- Unlike btrfs, there doesn’t need to be a relationship between the hierarchy of datasets and the filesystem hierarchy. Use this to your benefit. For example, you could have a top-level dataset called encrypted. If you encrypt that dataset, all its children will automatically be encrypted. Likewise, when you unlock that dataset, all the children will be unlocked.
- Play with it in a VM first so you understand how it works before trying to plan out your dataset hierarchy.
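A throwaway pool on sparse files is one way to try the encrypted-parent idea inside a VM. The sketch below only prints a possible sequence of commands; the function name, the image paths under /tmp, the passphrase key format and the /srv/photos mountpoint are all assumptions chosen for illustration. Creating the encrypted dataset will prompt for a passphrase when the command is actually run.

```shell
#!/bin/sh
# Print a command sequence for a disposable test pool. Nothing is executed.
plan_encrypted_demo() {
    pool=$1
    cat <<EOF
truncate -s 1G /tmp/$pool-a.img /tmp/$pool-b.img
zpool create -o ashift=12 $pool mirror /tmp/$pool-a.img /tmp/$pool-b.img
zfs create -o encryption=on -o keyformat=passphrase $pool/encrypted
zfs create -o mountpoint=/srv/photos $pool/encrypted/pictures
zfs get -r encryption,keystatus $pool
zpool destroy $pool
EOF
}

plan_encrypted_demo demo
```

The child dataset is mounted at a path that has nothing to do with its place in the dataset hierarchy, yet it inherits encryption from its parent, which is what the `zfs get` line is there to show.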
That’s what I plan to do, looks like the first hurdle is to somehow get to a system which supports zfs. From what I picked up so far what I need to do is the following:
- Get an os going on a small drive so I have something running I can install zfs support into
- Add the ZFS kernel repo
- Switch to ZFS kernel
- Add a new disk (virtual) which will become the ZFS drive
- Start recreating what I do on btrfs today (snapshots/backup/bootable snapshots)
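For the step between switching kernels and adding the new disk, a quick check that the zfs module is really loaded can save some confusion. A minimal sketch, assuming the module has been loaded with `modprobe zfs` beforehand; the function name is invented, and its optional argument exists only so the check can be pointed at a different sysfs root for testing.

```shell
#!/bin/sh
# Report whether the zfs kernel module is loaded by looking for its sysfs directory.
# The optional argument overrides the sysfs root (default: /sys).
zfs_module_state() {
    sysfs=${1:-/sys}
    if [ -d "$sysfs/module/zfs" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

zfs_module_state
```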
If I was to map this to my current physical installations I would have to do a reinstall from an Arch ISO with ZFS support added to be able to put the OS onto ZFS as well?