I just watched a video about how, many years ago, Steam deleted someone’s entire computer, including an external backup drive that was plugged in, due to a poorly written Bash script. The script in question contained this (including the ominous comment):
```bash
# Scary!
rm -rf "$STEAMROOT/"*
```
Earlier in the script, the variable `STEAMROOT` was set using some `cd` and `pwd` nonsense, which under unforeseen circumstances failed and left the variable empty. The command then expanded to `rm -rf /*`, destroying all user-owned data on the entire system.
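To make the failure mode concrete, here is a rough sketch (not Valve's actual code) of how the variable ends up empty, and a defensive idiom that would have prevented the disaster:

```bash
# Rough reconstruction of the failure mode; paths are illustrative.
# If the cd fails (e.g. the script was moved or invoked strangely),
# the subshell prints nothing and STEAMROOT ends up empty:
STEAMROOT="$(cd "${0%/*}" && pwd)"

# With STEAMROOT empty, "$STEAMROOT/"* expands to "/"* -- i.e. rm -rf /*
rm -rf "$STEAMROOT/"*

# Safer: ${VAR:?message} aborts the script with an error when the
# variable is unset or empty, instead of deleting everything:
rm -rf "${STEAMROOT:?STEAMROOT is empty, refusing to run rm}/"*
```

Putting `set -u` near the top of a script also helps, though it only catches unset variables, not ones that were set to an empty string.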
Obviously, this was an awfully written script. But it came from Valve, so you know it was run on thousands of computers. Maybe you ran it, too.
If such an outrageously bad script could pass Valve’s code review and quality assurance, think about the scripts you run almost every day, for example, when you update some package that was built from the AUR. If you are not thoroughly inspecting PKGBUILDs, you’re tickling the dragon’s tail.
Here is another example of how a space in a third-party script caused someone to lose their `/usr` directory. Stuff like that happens all the time, rarely due to maliciousness, but often due to incompetence or carelessness.
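This appears to be the infamous Bumblebee installer bug; if so, the offending line (reproduced here from memory, so treat it as illustrative) looked like this:

```bash
# One accidental space turns a single path into two arguments:
rm -rf /usr /lib/nvidia-current/xorg/xorg
#          ^ this space makes rm delete all of /usr, then the second path

# What the author intended:
rm -rf /usr/lib/nvidia-current/xorg/xorg
```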
Sometimes, things like updating the kernel can cause data loss. Even on ext4, even with an LTS kernel.
Speaking of incompetence and carelessness, there is also user error. Sooner or later you will mess up. More times than I’m willing to admit, I have typed in some command, triple-checked everything, hit Enter… and then immediately realised what I had done. The cold sweat, the shaking hands… and all I could utter was: “oh, ”…
Careless mistakes happen, and they will continue to happen. They also happen to much smarter and more experienced people than me.
System cleaning utilities like BleachBit are notorious for removing more than the user wants. While using such tools on GNU/Linux is generally unnecessary, and, to put it mildly, foolish, there are people who like using them regardless of the risk.
Finally, there is hardware failure. All hardware eventually fails. All HDDs and SSDs will die, sooner or later, and when they do, the data on them will probably be irrecoverable, for all practical purposes. It’s not a matter of if, but when. Good drives usually last long enough to become obsolete, but even the best drives fail. Think of all your hardware as expendable.
In any case, whether through your own fault or through things beyond your control, data loss is a very real possibility.
If you have a good backup, any issue you might have with your system, no matter how bad, is just an annoyance.
Given how storage is fairly cheap, it is downright stupid not to have a backup.
What constitutes a good backup?
Obviously, the more important the data, the more resilient the backup has to be. But in general:
- The backup has to consist of multiple copies on different storage. One copy is like zero copies. And if you keep multiple copies on the same drive, you’ll lose all of them when the drive fails. Use different physical drives.
- The backup should be physically separate from your computer. A stupid command like `rm -rf /` will wipe all user-owned files on every mounted drive on your computer. Even unmounting a backup drive is not enough: a power surge or a faulty PSU can destroy all the hardware on your machine. A network drive is also not good enough. Important files should be backed up on drives that are physically disconnected from any running computer, and connected (manually) only when a backup is taken or restored.
- The backup should be resistant to being overwritten by corrupted data. By the time you make a backup, a file might already be corrupted without you noticing, in which case you will save a corrupted file to your backup. What is really bad is when you overwrite a good version of the file with a corrupted one. Simply copying stuff to a different location, while infinitely better than no backup at all, is not good enough. Keeping old versions of your backup is a good idea; an incremental backup where nothing is overwritten is also good. Using a backup solution like Borg is excellent (see the sketch after this list), but keep in mind that the extra complexity can cause issues, too. Of course, avoid any proprietary backup software – proprietary formats are not future-proof by definition.
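As a rough illustration of what a versioned, nothing-overwritten workflow looks like with Borg (the repository path and archive name pattern below are placeholders, not a recommendation):

```bash
# Minimal Borg sketch; /mnt/backup/borg-repo and the archive naming
# pattern are placeholders -- adapt them to your own setup.

# One-time: create an encrypted repository on the backup drive
# (which you connect manually only for backups):
borg init --encryption=repokey /mnt/backup/borg-repo

# Each run creates a new, timestamped, deduplicated archive;
# existing archives are never overwritten:
borg create --stats /mnt/backup/borg-repo::home-{now:%Y-%m-%d} ~/

# Old versions are removed only when you explicitly prune:
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /mnt/backup/borg-repo
```

Because each `borg create` run makes a new archive, a file that silently went bad on your system only taints the new archive; the older, good versions remain restorable.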
Exactly how you do it is entirely your choice. But if you’re not doing it already, rethink your life and start doing it.