Data backup

anon69457321 · March 27, 2021, 5:13pm

Hello people.
How do you make backup copies of the files created on your system?
Through the CLI? … or a GUI?

Thanks.
Greetings, Juan.

dalto · March 27, 2021, 5:14pm

Maybe it is a translation issue, but I am not sure I understand what you are asking.

Could you give an example?

anon69457321 · March 27, 2021, 5:52pm

Hello.
Thanks for your answer.
This is one of those things that keeps holding me back when I try to learn GNU/Linux, and it happens to me very often on the Arch wiki, for example: since I am not fluent in English, I have to communicate through Google Translate and, of course … things do not always go well.

What I am trying to do is automate the backup of the things I work with every day on my PC, such as photos, text files or spreadsheets.

I have a second drive installed (the main one is an SSD, the secondary one an HDD) on which I want to regularly make copies of the data I work with on my system.

Backup copies, in short. Something independent of the system's btrfs snapshots.

And sorry for not knowing how to express myself in your language.

dalto · March 27, 2021, 6:04pm

It is no problem. Lots of people here are not native English speakers.

The real problem is there are lots of ways to do this.

If you are looking for true data backups, I recommend borg. It does encryption, compression, block-level dedup and is rock solid. If you prefer a GUI tool, vorta is a management GUI for it.

An alternative option is restic. While I believe borg is generally better, restic can more easily do cloud backups.

For a completely different approach, you could replicate your btrfs snapshots to an external volume. This is a good way to have a second copy of your data including all your historical snapshots.

There are a couple tools I would avoid.

duplicati - It just isn’t stable enough for use as a backup tool for important data. I had to migrate away from it due to multiple corruptions of backup sets. There seems to be no way to fix this when it occurs.
timeshift - While many people use this for backups, it is more of a snapshot tool than a true data backup solution. Also, if you are already using it for btrfs snapshots, I don’t think you can also use it for backups.

manyroads · March 27, 2021, 6:11pm

Personally I use grsync & Dropbox

dalto · March 27, 2021, 6:14pm

Personally, I wouldn’t call that a backup. Unless I am missing something, that is just a replica.

In other words, a backup solution should protect you against things like file corruption or mistakes. In the case of a replica, it will replicate the corruption/changes to the replica and the problem will exist in both places.

A replica really only protects you against a total disk failure or loss.

manyroads · March 27, 2021, 6:57pm

Here’s a pretty standard definition of Backup:
In short, there are three main types of backup: full, incremental, and differential.

Full backup. As the name suggests, this refers to the process of copying everything that is considered important and that must not be lost. …
Incremental backup. …
Differential backup. …
Where to store the backup. … (Dropbox is my offsite storage)

grsync & dropbox allow me to accomplish Full backups. I do not do incremental or differential backups. Dropbox is fast and accurate enough (for my needs) to cover my file change needs.

dalto · March 27, 2021, 7:16pm

That is a definition but I don’t think it is a standard universally accepted one.

Here is a different definition:

A backup is a copy of your current and active data which can be used for operational recoveries in the event that your data is lost or corrupted in some way. The main purpose of a backup is to restore your data to a previous point in time.

In the end, there likely isn’t any universally accepted definition so it is all semantics. However, to me, a copy of data that doesn’t protect against data corruption can’t be considered a “backup”. That is just a second copy of the data.

It is important to point out because I work with people all the time who have lost critical data because they thought their replica was a backup. When they go to recover it, they realize that the data issues were sent over to the replica and they have no way to recover.

manyroads · March 27, 2021, 7:40pm

Well, I guess we just disagree. My 45+ years of software engineering & IT management work says what I do works perfectly for my needs. grsync and Dropbox accomplish current and active data backup. Data corruption is a non-issue because I always have multiple recoverable sources. The most data I could ever lose is 10 minutes' worth of one current edit session; any incremental cost associated with reducing that time window is simply not worth it to me.

Could I have instantaneous data duplication… the answer is no, due to my network bandwidth limits. I am not running multi-billion dollar IT networks anymore. My genealogy data and email are sufficiently protected.

dalto · March 27, 2021, 7:46pm

If what you are doing works for you, that is great. However, that doesn’t mean it is generally good advice to give to others.

So, if you save a critical file on March 1st, it becomes corrupt on March 6th and you discover the issue on April 6th, how does a replica protect you? By then, the “bad” file will have been replicated to all your replicas, and you will no longer have a copy of the “good” file.
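
To make that timeline concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the file name `report.ods` and the helpers `mirror` and `snapshot` are hypothetical, and plain dictionaries stand in for real disks and real tools.

```python
from datetime import date


def mirror(source, replica):
    """Make the replica identical to the source, overwriting what was there."""
    replica.clear()
    replica.update(source)


def snapshot(source, history, day):
    """Store a dated, independent copy of the source next to earlier copies."""
    history[day] = dict(source)


# Hypothetical working data: file name -> contents.
source = {"report.ods": "good data"}
replica = {}
history = {}

# March 1st: the file is saved and both strategies run.
mirror(source, replica)
snapshot(source, history, date(2021, 3, 1))

# March 6th: the file silently becomes corrupt and both strategies run again.
source["report.ods"] = "corrupt data"
mirror(source, replica)
snapshot(source, history, date(2021, 3, 6))

# April 6th: the problem is discovered.
print(replica["report.ods"])                    # corrupt data
print(history[date(2021, 3, 1)]["report.ods"])  # good data
```

The replica only ever holds the latest state, so the good version is gone. The dated history still has the March 1st copy, at the cost of storing more than one version.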

SimonJ · March 27, 2021, 8:12pm

I use a bash script and rclone with box, mega, google etc.

https://rclone.org/

anon79501528 · March 27, 2021, 8:39pm

Timeshift
Rsync to SMB share on NAS
Hyperbackup to 2 separate cloud shares.

One backup is no backup. Back up in triplicate: onsite, offsite, cloud.

manyroads · March 27, 2021, 9:00pm

FWIW Dropbox (I would assume Mega, Google, Box, etc. as well) provides 30 days of file versioning/recovery, see:

Additionally, I maintain full backups on a rotating basis on a set of removable drives via grsync (I keep them on 4 separate drives). Sadly, because the USB drives are local, I actually trust my Dropbox remote storage for distance recovery, should it ever be necessary (it never has been).

Email is included in both of the above.

I don’t use a phone for any data at all, so there are no issues there for me.

I have never had anything that was not recoverable outside the windows I maintain. If I were worried, I could run checksums on my data to act as further alert/control.
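
As a rough idea of what such a checksum check could look like, here is a small Python sketch using only the standard library. The helper names (`sha256_of`, `build_manifest`, `changed_files`) and the example file names are made up for illustration. Note that a changed checksum only says the bytes changed; it cannot tell corruption from a deliberate edit, so the result is an alert to look at, not a verdict.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """List files present in both manifests whose checksum differs."""
    return sorted(name for name in old if name in new and old[name] != new[name])


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "family_tree.txt").write_text("original")
    (root / "inbox.txt").write_text("mail")
    before = build_manifest(root)

    (root / "family_tree.txt").write_text("damaged")
    after = build_manifest(root)

changed = changed_files(before, after)
print(changed)  # ['family_tree.txt']
```

In practice the "before" manifest would be saved next to the backup and compared against a fresh one before each run, so an unexpected change in a file nobody touched stands out.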

The bottom line with regards to backup is that you need to have an affordable, reasonable plan that meets your data recovery concerns & needs. If everyone adhered to those guidelines, things would be more secure.

Hystrix · March 28, 2021, 4:59am

Happened to me once. Some pictures on my drive got corrupted. I noticed it later. But because I used rsync to create backups, all backups had the corrupted images. Fortunately I had those pictures on a DVD. Otherwise I would never have got them back (I tried data recovery with no luck).

I am not saying this will happen to everyone using a replica as a backup. But if you are unlucky you may never get your data back.

Now I am using borg.

Additionally, I don’t trust the cloud, so I don’t use it.

Zircon34 · March 28, 2021, 6:35am

I suspect from the OP, timeshift will do as an easy backup solution.

Edit: for system backup but not for personal files. Other posts point to some better tools.

I use Nextcloud on my own server with RAID (two hard drives mirrored), on the LAN via wifi and not connected to the internet. I trust it, but a few comments in this thread make me think about file corruption. It happens rarely, but when it does, a versioning system is great.

Edit: I will have to check how Nextcloud deals with it, but I also make copies of important files on an external drive, monthly.

Dropbox’s versioning system once helped me recover a corrupt file, but I would not recommend it for large system backups; rather use an external hard drive.

I had to switch away from Dropbox because I moved to a rural area with painfully slow internet. I learned to set up my own server.

TomZ · March 28, 2021, 7:33am

Interesting discussion. I lean towards @dalto's point of view: my single reason to make backups is so I can restore my data if I need to. It just happens that this requires creating replicas somewhere else, and I use Borg for that. What I don’t understand, though, is how Borg would prevent a corrupt file from being added to the backup?

SimonJ · March 28, 2021, 7:45am

I use cloud services for daily sync backups but for pictures and music I also have a removable drive which I use monthly.

Kresimir · March 28, 2021, 9:28am

The biggest misconception regarding Timeshift is that it is a backup tool. It is not; it is a system restore tool. The official Timeshift documentation is pretty clear about that: it explicitly warns people against using Timeshift for making backups. It’s just that people don’t read the documentation.

Other than that, I would not encourage people to avoid Timeshift. If one is not using btrfs as one’s filesystem of choice, Timeshift is a great ease-of-life utility that can restore an unbootable or poorly configured system to a previous snapshot, and it is a very reliable program at that. Btrfs makes Timeshift obsolete though.

Hystrix · March 28, 2021, 9:38am

A replica replaces the backup file with the corrupt file, while borg adds it to another location and does not delete the previous version of that file.

If you sync your files to a backup location (a folder) using rsync or any other replication tool, it replaces the files on the backup with the newer versions. If a file gets corrupted, the program sees it as a modification and replaces the file on the backup with the corrupted one. So after some time, when you open your file, you will find it is corrupted. But when you look in the backup, you will see that copy is corrupted too. This actually happened to me.

What borg does is copy the modified files to the backup and store them separately, i.e. it will not replace old files with new ones. So each backup you make is an archive. If a file gets corrupted, borg will copy this file to the backup but will not delete the old version like rsync does. So when you find out your file is corrupted, you can look through older backups (archives) and restore the file from one of them.

To use rsync in a way that prevents this from happening, you would have to make a separate backup each time, i.e. to a different location (folder) each time. But this will use way too much space on the backup. With borg, de-duplication reduces the size of each backup significantly.
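
The idea of keeping every archive while storing identical data only once can be sketched in a few lines of Python. This is a toy model, not how borg is implemented: borg deduplicates at the level of chunks inside files, while this sketch deduplicates whole files, and the `ArchiveStore` class and file names are invented for illustration.

```python
import hashlib


class ArchiveStore:
    """Toy archive store: every archive is kept, identical content is stored once."""

    def __init__(self):
        self.blobs = {}     # content hash -> bytes, stored only once
        self.archives = {}  # archive name -> {file name: content hash}

    def create_archive(self, name, files):
        """Record the current state of all files as a new archive."""
        index = {}
        for file_name, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            # Unchanged content is already stored, so it costs no extra space.
            self.blobs.setdefault(digest, data)
            index[file_name] = digest
        self.archives[name] = index

    def restore(self, archive, file_name):
        """Return a file's contents as they were in the given archive."""
        return self.blobs[self.archives[archive][file_name]]


store = ArchiveStore()
files = {"photo.jpg": b"original pixels", "notes.txt": b"shopping list"}
store.create_archive("monday", files)

# The photo gets corrupted before the next backup run.
files["photo.jpg"] = b"corrupted pixels"
store.create_archive("tuesday", files)

print(len(store.blobs))                      # 3
print(store.restore("monday", "photo.jpg"))  # b'original pixels'
```

Two archives of two files each would be four stored files with plain folder copies. Here only three pieces of content are stored, because the unchanged `notes.txt` is shared, and the good photo is still restorable from the older archive.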

pierrep56 · March 28, 2021, 9:50am

I am using 2 programs:
Timeshift for snapshots of the system state
Backintime for incremental backups of my data.

Both are running twice a day, set as a cron job.
Both are saving their data on a local external SSD disk

And both programs are set to keep one set of the last 5 days data.

I have used this setup for years (first on Manjaro, and for nearly 10 months now on EndeavourOS). I have restored the full system and some personal data many times. It has never failed me.

EDIT: And I keep on another drive one full data backup PER YEAR, for the last 5 years… just in case.
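
A retention rule along these lines (recent backups plus one per year) could be sketched in Python as below. This is a hypothetical pruning function, not how Timeshift or Back In Time decide what to keep; the name `backups_to_keep`, its parameters and the sample dates are assumptions, and it treats both drives as a single list for simplicity.

```python
from datetime import date, timedelta


def backups_to_keep(backup_days, today, daily_window=5, yearly_count=5):
    """Pick which backup dates to keep: recent ones plus the newest per year."""
    keep = set()

    # Keep every backup made within the last `daily_window` days (today included).
    cutoff = today - timedelta(days=daily_window)
    keep.update(day for day in backup_days if day > cutoff)

    # Keep the newest backup of each of the last `yearly_count` calendar years.
    newest_per_year = {}
    for day in backup_days:
        if day.year > today.year - yearly_count:
            if day.year not in newest_per_year or day > newest_per_year[day.year]:
                newest_per_year[day.year] = day
    keep.update(newest_per_year.values())

    return sorted(keep)


# Hypothetical backup dates.
backup_days = [
    date(2019, 12, 31),
    date(2020, 6, 1),
    date(2020, 12, 30),
    date(2021, 3, 20),
    date(2021, 3, 25),
    date(2021, 3, 27),
]
kept = backups_to_keep(backup_days, today=date(2021, 3, 28))
pruned = sorted(set(backup_days) - set(kept))
print(kept)
print(pruned)
```

With these sample dates, the backups from June 1st, 2020 and March 20th, 2021 are the ones that fall outside both rules and would be pruned.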