Btrfs as a home user - why?

But how much space does it use up? If I have a 300 GB /home with 200 GB of empty space… that doesn’t seem doable.

And as I say, I believe I am an uncommon Linux user, but I am a common HOME user. That is, I use Linux the way I use Windows: to start a browser. 99% of my time at the computer is spent in a browser; I don’t code, I don’t “work” on my computer. I watch movies, I surf, and I play games (dual boot).

The snapshots themselves use very little space inside the source partition.

If you replicate them to a second location, it will use the same amount of space in both locations. So if you replicate the entirety of your /home partition, it will take 300GB in the second location.

Of course, you would probably be using compression on btrfs so it would probably take less space in both places.

All that being said, you seem to be looking at btrfs purely as a backup tool, and that is not really the point of btrfs or other advanced filesystems. Snapshots are just one of the many advantages these types of filesystems offer. There are disadvantages to consider as well.

If you are looking for simplicity, btrfs isn’t going to make things simpler. It adds features which are not present in basic filesystems which inherently makes it more complicated. In theory, you would use btrfs because you want those features.

Being a home user has nothing to do with it either way. There are lots of benefits for home users many of which are detailed above already.

The thing is that for the average user, snapshots ARE the main selling point of Btrfs, for someone not managing servers or a business setup.
The rest is… good for a software engineer or computer technician, but being able to recover lost data is THE selling point.

That is silly. Have you investigated all the other benefits?

For example, in what way is high-speed transparent compression not useful to a home user?

Almost everyone on this board is a home user when it comes to Linux.

Even if snapshots are the main selling point, you are still thinking of snapshots as backups rather than as snapshots.

Also, all of this functionality gets more interesting when you consider how much flexibility you gain with lots of subvolumes.

To get real value out of btrfs or similar filesystems you need to be willing to open your mind and think differently about your data. It is a paradigm shift compared to traditional filesystems.

Keep in mind that a btrfs snapshot doesn’t duplicate files. It doesn’t even reference files; it references data at the block level. E.g. from here:

Naturally, Btrfs uses extents as well. But it differs from most other Linux filesystems in a significant way: it is a “copy-on-write” (or “COW”) filesystem. When data is overwritten in an ext4 filesystem, the new data is written on top of the existing data on the storage device, destroying the old copy. Btrfs, instead, will move overwritten blocks elsewhere in the filesystem and write the new data there, leaving the older copy of the data in place.

Copy-on-write also enables some interesting new features, the most notable of which is snapshots. A snapshot is a virtual copy of the filesystem’s contents; it can be created without copying any of the data at all. If, at some later point, a block of data is changed (in either the snapshot or the original), that one block is copied while all of the unchanged data remains shared.

So it depends on your workload. If you do e.g. video editing and the system constantly creates gigabytes of new data, that is a problem. If you mostly work with text files, we are talking about a dozen MB of overhead for an hourly snapshot. As a “normal” user you can make hundreds or thousands of snapshots with 200 GB of free storage.

I’m happily running 1 hour rotating snapshots on my 256 GB laptop (with 48 hours max).
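For illustration, a rotation like that can be sketched as a small script run from an hourly cron job. This is only a sketch under stated assumptions: the paths (`/home`, `/home/.snapshots`), the retention count, and the naming scheme are all hypothetical, and it must run as root on a btrfs filesystem.

```shell
#!/bin/sh
# Hypothetical hourly snapshot rotation for a /home subvolume.
# Assumes /home is a btrfs subvolume and /home/.snapshots already exists;
# adjust the paths to your own layout. Run as root, e.g. from hourly cron.

SNAPDIR=/home/.snapshots
KEEP=48  # keep 48 hourly snapshots, matching the setup described above

# Create a read-only snapshot named after the current hour
btrfs subvolume snapshot -r /home "$SNAPDIR/home-$(date +%Y-%m-%d-%H)"

# Delete the oldest snapshots beyond the retention window
# (names sort chronologically, so "all but the last $KEEP" are the oldest)
ls -1d "$SNAPDIR"/home-* | head -n -"$KEEP" | while read -r snap; do
    btrfs subvolume delete "$snap"
done
```

Read-only snapshots (`-r`) are also what `btrfs send` requires later, should you ever want to replicate them elsewhere.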

I am not saying it is not, but quite frankly it is nothing that the average home user even thinks about. The last time I used compression and thought about it was on Windows XP, because hard drives were still frakking expensive and trying to squeeze out another 100 MB was important.

As for flexibility and subvolumes… now you are definitely NOT talking about the average home user, and that is one of my points. No average home user ever thinks about subvolumes.

And again, a snapshot is a backup, but not all backups are snapshots.

A snapshot is 100% not a backup. It is literally the same copy of the data. You are continuing to not understand here.

Uhh…nobody thinks about features that they don’t know about. That doesn’t mean they aren’t useful to a home user.

If you have a preconceived notion that this isn’t useful to you and want us to convince you otherwise that probably isn’t going to happen.

You have to take the opposite approach, explore all the options and functionality and then decide if the pros are worth the cons.

Some real world data: My main desktop has / and /home on a 500 GB SSD with ca. 100 GB free and 1 hour/48 hours rotating snapshots. According to this I have:

Unique File Extents  Extents added ontop   Extents added ontop of
per       subvolume  of previous subvolume current(act) subvolume
---------------------|---------------------|----------------------
SubvolumId       Size                  Size                   Size
5      0.00B                 0.00B                  0.00B
3507   59.31MiB             250.96GiB              250.97GiB
3506   19.75MiB              37.27GiB               37.27GiB
3505   55.78MiB             250.96GiB              250.96GiB
3504   16.75MiB              37.26GiB               37.26GiB
3503   52.90MiB             250.96GiB              250.96GiB
3502   16.25MiB              37.26GiB               37.26GiB
3501   55.21MiB             250.93GiB              250.93GiB
3500   24.54MiB              37.26GiB               37.26GiB
3499   54.66MiB             250.93GiB              250.93GiB
3498   23.40MiB              37.25GiB               37.25GiB
3496   53.29MiB             250.79GiB              250.79GiB
3495   15.00MiB              37.25GiB               37.25GiB
3494   53.81MiB             250.78GiB              250.79GiB
3493   14.46MiB              37.25GiB               37.25GiB
3489   97.08MiB             250.76GiB              250.77GiB
3488   15.32MiB              37.25GiB               37.25GiB
3487    2.64GiB             250.72GiB              250.72GiB
3486   15.05MiB              37.24GiB               37.24GiB
3485   52.27MiB             250.73GiB              250.73GiB
3484   35.83MiB              37.13GiB               37.14GiB
3483   42.88MiB             250.73GiB              250.73GiB
3482   34.25MiB              37.13GiB               37.14GiB
3481   52.27MiB             250.76GiB              250.77GiB
3480   34.84MiB              37.13GiB               37.13GiB
3479   55.14MiB             250.76GiB              250.76GiB
3478   35.66MiB              37.13GiB               37.13GiB
3477   56.20MiB             250.76GiB              250.76GiB
3476   45.54MiB              36.87GiB               36.88GiB
3475   83.69MiB             250.76GiB              250.76GiB
3474   52.48MiB              36.62GiB               36.62GiB
3455  140.35MiB             250.66GiB              250.66GiB
3454   42.23MiB              36.28GiB               36.28GiB
3423  171.37MiB             250.32GiB              250.32GiB
3422   92.67MiB              36.52GiB               36.52GiB
3393  173.75MiB             250.33GiB              250.33GiB
3392   86.52MiB              36.15GiB               36.15GiB
3360    3.98GiB             250.88GiB              250.88GiB
3359    1.17GiB              35.89GiB               35.89GiB
886   59.81GiB              59.81GiB               59.81GiB
261      0.00B                 0.00B                  0.00B
260      0.00B                 0.00B                  0.00B
257  417.94MiB             251.00GiB              251.01GiB
256   22.05MiB              37.28GiB               37.28GiB

Second column shows the diff. The ca. 37 GB subvolume is / and the ca. 250 GB subvolume is /home.

That said, I use a different drive for big, volatile data (~/Downloads, ~/VMs) and these are not included in those snapshots.

I think we have a linguistic definition problem here, not a logical one.

If I can reach back and restore something from data that has been preserved, then it is indeed a backup. It doesn’t matter if it is actively copied to another place, or another folder, or if it is preserved by the file system deliberately writing the new data to another block on the drive to preserve the first version of the data. It is still a backup.

The data isn’t preserved unless it is changed. In simple terms: if a block becomes corrupted, it will be corrupted in both the current data and all the snapshots that reference that block.

It is literally not a backup by most definitions of a backup.

There is a line of thought that says: if it’s on the same physical medium (same point of failure) it isn’t a backup. Snapshots are usually on the same medium.

Now, this is very interesting. The question now becomes A) how easy is it to set up, B) how easy is it to actually access, and C) can I ALSO easily have off-drive backups?

The big benefit of having an off-drive backup, power surges and drive failures aside, is that when I reinstall the system everything is already backed up; I can just format the drive and copy everything back after the fact.

Again, there seems to be a linguistic definition issue here. I accept it, but to me, if I copy a text file into the same folder, it is still a backup of the text file I am working on.

Timeshift definitely doesn’t have this feature. Some of the tools in the link @anon31687413 posted above look like they could get the job done.

In the end, all it would take is writing a very simple send/receive script once, which you could then invoke manually, via cron, via a hook on snapshot creation, etc. Not really that much work :wink: . If you go the incremental route, this will also be much faster than plain rsync backups in the long run.
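A minimal sketch of such a send/receive script might look like the following. Everything here is an assumption, not anything prescribed in this thread: the snapshot directory, the mount point of the external drive, and the premise that snapshots are created read-only (which `btrfs send` requires). It must run as root, and the destination must itself be btrfs.

```shell
#!/bin/sh
# Hypothetical incremental send/receive backup: replicate the newest
# read-only snapshot in /home/.snapshots to a btrfs-formatted external
# drive mounted at /mnt/backup/home. All paths are assumptions.

SRC=/home/.snapshots
DST=/mnt/backup/home

NEWEST=$(ls -1d "$SRC"/* | tail -n 1)
LAST=$(ls -1d "$DST"/* 2>/dev/null | tail -n 1)

if [ -n "$LAST" ]; then
    # Incremental: send only the blocks changed since the last snapshot
    # that already exists on the destination. Assumes that parent
    # snapshot also still exists on the source side.
    btrfs send -p "$SRC/$(basename "$LAST")" "$NEWEST" | btrfs receive "$DST"
else
    # First run: no common parent yet, so do a full send
    btrfs send "$NEWEST" | btrfs receive "$DST"
fi
```

The incremental `-p` variant is what makes this faster than rsync over time: only changed extents cross the wire, and the receiving side ends up with proper snapshots of its own.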

But there’s no shame in sticking to the good old trusted rsync, if your main reason for looking into btrfs is just creating reliable backups.


On a side note:

Last time I checked, a freshly installed EndeavourOS offline install actually only took up about 2.5 GB of real space. Right now my root on this laptop has an uncompressed size of 13 GB and a compressed (with zstd) size of about 7.7 GB. Due to the nature of most of the larger files (video, music), my home subvolume doesn’t compress quite as well, but overall I find this very impressive for transparent compression.
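For reference, numbers like these can be measured with the `compsize` tool, which walks a btrfs subtree and reports compressed vs. uncompressed sizes per algorithm. The path is an assumption; it needs root and only works on btrfs.

```shell
# Report disk usage, uncompressed size, and a per-algorithm breakdown
# (none/zstd/lzo/zlib) for everything under the root subvolume
sudo compsize /
```

The "Disk Usage" column it prints is the actual space consumed on the device, which is the number that matters when judging how much compression saves you.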

It surely depends on the definition. Can you restore the data if the original is lost? And what does lost mean? Did you accidentally overwrite one file? Did the drive die? Was the computer stolen? Did your house burn down?

The minimum academic definition of a backup would probably require at least two off-site backups in two different places.

I don’t see snapshots as backups, so this resolves very easily by keeping them separate. One-hour same-drive internal btrfs snapshots, kept for 48 hours. Daily external backups that don’t depend on snapshots at all - as if it were an ext4 drive.

Btrfs has snapshots built in, so it’s a question of which tool triggers them and which kind of convenience is desired. Personally, I use btrbk for same-drive snapshots and the file system itself to browse the backups if necessary. For external backups I use Back In Time with rsync.

It’s possible to use btrfs snapshots as the backup source for improved performance (and more?), but I haven’t bothered yet. Daily backups with rsync take maybe five minutes of background work here. It’s fine.

More generally, you do not use btrfs to have “easier” backups.
You use it because of the COW feature and everything that comes with it.
You also use it because it makes it easy to add new devices to an existing filesystem, and because it supports RAID 0/1.
Running out of space? No issue! Add a new drive and simply run btrfs device add ... (*) to add it to your existing filesystem.

Fun thing: you do not even have to create partitions for btrfs. You can simply assign it a whole device.


(*) followed by a btrfs balance
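As a sketch, growing an existing filesystem with a second disk could look like this. The device name (`/dev/sdb`) and mount point (`/data`) are assumptions; both commands need root, and adding a device wipes what was on it.

```shell
# Add a whole second disk (/dev/sdb, hypothetical) to the btrfs
# filesystem currently mounted at /data
sudo btrfs device add /dev/sdb /data

# Then rebalance so existing data and metadata get spread
# across both devices
sudo btrfs balance start /data
```

Without the balance, new writes would use the added device but existing data would stay where it is.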

First of all, the partitionless option is STRONGLY advised against, at least by the Arch Wiki.

As for the rest: that is sort of my point. For normal home users, who, I stress again, are NOT the average Linux users (which, btw, again illustrates why Linux is not going to be a mainstream desktop option for at least 15 more years), none of those features are really relevant. ESPECIALLY not if you have to use the terminal or a TTY to do it.

The two things people in general associate with Btrfs, if they have heard of it at all, are: snapshots and lost data. Luckily we are getting away from the second one, but for a lot of people btrfs is just shorthand for “I know someone who lost all his data because he used it”. Maybe, as pointed out above, compression.

Now, I am interested in trying this out, especially the compression feature, because why not. Technically my Windows install is my ultimate backup, since I can always boot back into Windows no matter what I muck up…

Am I correct in assuming I need to manually edit the fstab to enable compression AND force a defrag for it to take effect on already installed files?

Yes, because it can create problems with the bootloader.
I tested it long ago with GRUB and it seems to work just fine.
But you’re right, it’s not ideal if you want to boot from the device.

Compression can be enabled at any point, but it does not apply retroactively. You have to run a defrag operation to apply it to already existing files.
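As a sketch of those two steps: the fstab entry below is illustrative only (the UUID is a placeholder, and your subvolume options will differ), and the defrag command needs root.

```shell
# 1) In /etc/fstab, add compress=zstd to the btrfs mount options:
#    UUID=<your-uuid>  /  btrfs  subvol=@,compress=zstd,noatime  0 0
#    (or apply immediately with: sudo mount -o remount,compress=zstd /)

# 2) Recompress files that already exist by defragmenting with zstd:
sudo btrfs filesystem defragment -r -czstd /
```

One caveat: defragmenting rewrites files and therefore breaks their block sharing with existing snapshots, which can temporarily increase space usage until old snapshots rotate out.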

EDIT: you are also right that btrfs might not be the best choice for the average/normal/n00b user. One exception might be distros which have btrfs and auto-snapshots preconfigured and ready to use without much user intervention.

Afaik still new and not recommended. Also afaik not an option if you want encryption.

Let’s not pretend that the normal partition scheme, with boot/EFI, swap/hibernation, root and maybe a data partition, is easy to set up either. The GUI installer does that for the normal user, no matter whether it’s ext4 or something else. If GUI installers offer btrfs by default, which they do more and more, this issue goes away too.

That’s a nice btrfs feature too.

Usually you set the desired compression algorithm (and optionally autodefrag) in fstab, or when mounting. All newly written data uses the currently set compression; existing data stays as it is and is read with the compression it was written with.
