The importance of backups
If you have ever lost data, then you probably wished you had a second copy of it. I had such an experience myself many years ago. I powered up my computer as usual, expecting it to boot normally. Instead I was greeted by some weird grating noises, and the computer did not boot. My BIOS told me that it could not find any bootable hard drives. I can't begin to describe the sinking feeling inside me as I realized that there was nothing I could do. This was back in the early 2000s, so I didn't have an SSD, just a plain old HDD.
I never found out exactly what was wrong, but the grating noise from the HDD made it clear that it was an actual physical failure. If it had been a software problem I might have been able to rescue some data, but for a home user a physical problem with a hard drive is pretty much a lost cause. My data was lost, and I had no backup. This taught me a very important lesson: always have backups of important data. Everything that I can't easily recreate or regain from other sources needs to be backed up.
This old experience of mine reminds me of a saying I've come across: "If you don't have a backup then you don't have data."
So then how do you take a backup? I know some people rely on software that creates snapshots of your OS install. Timeshift is one such tool, and it is great for rolling back changes on your system. But these are not backup tools, because if the actual hard drive is damaged then you still lose all your data. If you have a backup stored on the same device, then it is not a backup. Any failure on that device has a high probability of rendering your backup useless. So for a backup to actually serve any purpose as a backup, it needs to be on a different device. Personally I use an external hard drive. I can easily connect it to my computer when I want to "update" my backup, and the rest of the time I simply store it in a drawer. This form of backup is enough for most cases, since it's extremely rare for your device and an external hard drive to fail at the same time. But if you want to be serious about your backup then you need to plan for this as well. To use a very simple example: if your home burns down, then you lose both devices if they are kept in your home.
You should definitely keep a backup at home. It is the most convenient one and probably the one you are most likely to need. But make sure to also keep a backup somewhere else. You need to plan for the worst case scenario, and that means off-site backup. A simple way to do this is to have another external hard drive with a backup, and store it somewhere other than your home. You could also use a cloud storage solution, but if you are storing large amounts of data then it could become expensive. The obvious advantage of a cloud service is that you can upload a backup whenever you want. If you use an external hard drive that you store at another place, then you need to retrieve it whenever you want to update your backup.
At this point, we have an idea of what kind of backups we need:
- A backup kept at the same location as the data, for convenience
- A backup kept at some other location, for disaster recovery
This is what I would consider a minimum. At this point you essentially have three copies of your data. The first copy is the "live" copy, which is on the computer you are actively using. The second copy is the backup you put on an external hard drive at the same location. The third copy is the backup you keep at some other location. It is important to ensure that your backups are kept up to date with the original data. Personally I update the on-site backup at least once per month, ideally more often. The off-site backup is something I update much less frequently, about once every 6 months. How often you should update your backups depends entirely on how frequently your data changes.
Alright, we have gone through some ideas around backups, but the actual implementation is important too. I am not going to go into all the details, but I will look at some easy methods. This will definitely not be a guide suitable for a professional setup, but it should be adequate for most home computer users.
The actual data you want to back up is a decision that varies from person to person. As a general rule of thumb, go for data that is unique. For most people this is likely to be private photos and documents. Data that is difficult to reacquire would also be a good choice for backup. As a gamer, I consider save data from games as something important to include in a backup. Even though many games have cloud storage for save data (most Steam games, for instance), I also play many older games. All my GOG games are also completely offline and don't have any cloud storage for saves. I definitely don't want to lose any progress in games, so save data is always included.
For on-site backup, the easiest solution is to have an external hard drive. For backups I mostly just use HDDs since they are much cheaper than SSDs. The speed of the drive does not really matter since it will not be accessed frequently. As for how to get your important data over to the external drive, there are several ways. You could simply copy directories and files over as you normally would, and in many situations that would probably suffice. It might also be helpful to find some software that helps with this process.
If you are a Linux user, then rsync is an excellent tool. There are lots of articles that detail the options of rsync, and the Arch Linux wiki has some excellent pointers on different use cases: Arch wiki: rsync.
The goal is simply to get extra copies of your files over to an external hard drive, so don't make it too complicated.
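For readers who prefer a small script over copying by hand, here is a minimal sketch of the "copy new and changed files" idea in Python. It is not a replacement for rsync and it is not how any particular tool works. The function name copy_new_and_changed and the paths in the usage note are made up for illustration, and it assumes that a file needs copying when it is missing from the backup or when the backup copy has an older modification time.

```python
import shutil
from pathlib import Path


def copy_new_and_changed(source: Path, backup: Path) -> list[Path]:
    """Copy files that are missing from the backup or newer in the source.

    Returns the relative paths that were copied. Files deleted from the
    source are left untouched in the backup.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue  # directories are created on demand below
        rel = src_file.relative_to(source)
        dst_file = backup / rel
        if dst_file.exists() and dst_file.stat().st_mtime >= src_file.stat().st_mtime:
            continue  # the backup copy is already up to date
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also copies the modification time
        copied.append(rel)
    return copied
```

You would call it with something like copy_new_and_changed(Path.home() / "Documents", Path("/mnt/backup/Documents")), where both paths are placeholders for your own directories. Since nothing is ever deleted from the backup, an accidental deletion on your computer does not carry over to the external drive.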
There are two ways to approach off-site backup. The easiest solution is to go the same way as with on-site backup. Just get yourself an additional external hard drive, copy all your important stuff over to it, then store it somewhere else. It is important that you store it in a completely different building. If your home burns down, destroying your storage devices at home, then you still have your off-site backup.
The only real downside to this solution is that it requires some effort to update your backup. A few months after you took the backup, you might realize that you have new important data that should be added to the backup. In this case, you would need to retrieve the external hard drive, update the backup, then take the drive back to the storage location. It might be inconvenient depending on where you store it, but this kind of backup is something you should not need to update too often.
Lastly, I would highly recommend encrypting the hard drive used in this case. Since this drive will be stored somewhere else, there is always the chance that someone could get their hands on it. If the drive is encrypted, then your data can't be read by anyone who doesn't have the password. Encryption that you can unlock with a (complicated) password should be good enough for most use cases. Just don't forget the password or your backup will be useless.
There is another way to have off-site backup, and that is to use a cloud storage provider. The main thing speaking in favor of this is the convenience. Instead of needing to move a physical storage device somewhere else, you can simply upload your backup to cloud storage. The downside to cloud storage is the associated cost. If you happen to only have a couple of gigabytes of data, then you might be okay using a free storage provider. But if you have hundreds of gigabytes, then it might get costly. Another issue is the process of uploading. If you have large amounts of data, then it will be extremely inefficient to upload one file at a time. A better approach is to put files into archives so that you upload a smaller number of larger files.
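To make the archive idea concrete, here is a minimal sketch using Python's standard tarfile module. It packs one directory into a single compressed archive and lists what ended up inside. The function names make_archive and list_archive are my own for this example. It only covers the packing step, so encryption and the actual upload are not shown.

```python
import tarfile
from pathlib import Path


def make_archive(source_dir: Path, archive_path: Path) -> Path:
    """Pack source_dir into one gzip-compressed tar archive."""
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name
        # instead of the full absolute path on this machine
        tar.add(source_dir, arcname=source_dir.name)
    return archive_path


def list_archive(archive_path: Path) -> list[str]:
    """Return the names of the regular files stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

Keep the archive outside of the directory you are packing, otherwise the archive would try to include itself. Listing the contents right after creating the archive is a cheap first check that the files you expected are really in there.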
No matter what approach you go for, encryption is very important for off-site backups. If you are using an external hard drive, make sure everything is encrypted. Preferably just use an encrypted partition so that every file you put there is encrypted. If you decide to use a cloud storage provider, then encrypt the archives before you upload them.
Note on encryption
There are multiple ways to encrypt stuff, which makes it extremely important that you know you can decrypt your data in the future. If you use any form of password encryption, then a strong password is important. Just make sure it is something you will still remember when disaster hits.
If you use key based encryption, then this key must be retrievable for you even if you lose all storage devices in your home. This means that your encryption key must be stored in a different off-site location, ideally two separate locations, which should be different from where you store your backup data. The encryption key itself should also be protected with a password. Otherwise, anyone who finds the key file can use it.
The most convenient way to encrypt your stuff is just to rely on password based encryption without any key. But for stronger security, key based encryption is the way to go. The downside to using key based encryption is that there is slightly more of a learning curve, as well as the added responsibility of keeping your key safe. I strongly recommend taking the time to learn a bit about how the tools you rely on work. This applies to anything, but especially encryption.
Make sure that you verify that your backups are correct and that you are able to recover your data from them. This should be done at regular intervals, not only for the verification itself but also to ensure that you know how to recover your data. That way, if you really do need your backups, recovering your data will be a simple process.
There is nothing worse than having never verified your backups, only to learn that your backups are incorrect/inaccessible when you desperately need them. This is especially important for off-site backup. As an exercise I suggest retrieving your off-site backup, but with the working assumption that your computer was destroyed and you don't have access to any of your data. Ideally do this from a different computer with a clean OS install. Such a simulated disaster recovery will help you become aware of any potential issues you might have overlooked.
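A simulated recovery is the real test, but a quick automated check between such exercises can also help. Below is a minimal sketch in Python that compares SHA-256 checksums of the files in a source directory with the files in a backup directory, and reports anything that is missing or different. The function names are made up for this example. It assumes that the backup is a plain copy of the directory (not an archive) and that the original files are still available to compare against.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_backup_problems(source: Path, backup: Path) -> dict[str, list[str]]:
    """List source files that are missing from the backup or differ from it."""
    problems = {"missing": [], "different": []}
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = backup / rel
        if not dst_file.is_file():
            problems["missing"].append(rel.as_posix())
        elif sha256_of(src_file) != sha256_of(dst_file):
            problems["different"].append(rel.as_posix())
    return problems
```

If both lists come back empty, every file in the source has an identical copy in the backup. A file listed as "different" is not necessarily corrupt, it may simply have changed since the last time you updated the backup.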
Shameless self promotion
I have created a piece of software that automates the process of creating multiple backup archives. Put very simply, my software creates archives that can optionally be compressed (to take less space) and encrypted. These archives can then optionally be uploaded to a cloud provider. This all happens automatically, and as a user you simply have to start the program with the desired options. If the process aborts at any step, then you can simply resume it instead of starting over.
My backup software is intended for competent users who are familiar with the concepts, or are willing and able to learn. There is no graphical interface. All options are set in a configuration file, and then you run the software via the command line. Interested users can look it up on my GitLab: cebac.
I also have an AUR package for Arch Linux: cebac AUR.
A very brief summary of my recommendations:
- Three copies of your data
  - Live copy, on the device in use
  - On-site backup: external device/storage at the same location
  - Off-site backup: different device/storage at another location
- On-site backup should be updated frequently to minimize data loss in the event it is needed
- On-site backup is the primary backup
- Off-site backup is intended for disaster recovery
- Encrypt your backups to prevent potential data theft
- Verify your backups, both on-site and off-site
- Keep your backups up to date with any changes in your important data
- Back up everything that you don't want to lose