Backing Up Data: An Inexpensive Alternative Solution

For a period of time over the past year or so, I have had no backups of data on my personal computer.

Yeah yeah, I know this is a tremendously bad idea, and that if a drive goes or a fire happens I would lose everything that wasn’t on my Google Drive. For some time I was using a service called Symform, which let me back up as much as I wanted for free, so long as, for every gigabyte I backed up, Symform could use several gigabytes of my drive to back up other people’s data. This seemed like a marvelous solution for a cash-strapped college student who had a secondary 2TB drive with almost nothing on it. Alas, Symform was acquired by a company called Quantum, and the Symform service was shut down early last year.

Since then I’ve looked into Carbonite and Backblaze. They seem like good enough services, but $9.99 (Carbonite) or $5 (Backblaze) a month for a personal plan adds up to roughly $120 or $60 over the course of a year. As a cash-strapped recent college graduate, I wondered if there was a way to get the job done less expensively.

As it turns out, there is.

Backing Up Data Directly to the Cloud

Microsoft, Amazon, and Google (and most likely other providers) all offer cloud computing services for running things like scalable, on-demand web servers that don’t go down when your power does. These companies also offer cloud storage. They are set up to take advantage of economies of scale, and are all in a knock-down, beat-’em-up fight with each other to gain the upper hand in this market. What this means for us as tech-savvy consumers is that we can back up our data directly to the cloud at a price of pennies a month per gigabyte.

This is currently what I’m doing to back up data on both my main rig and my webserver. I’m using a service provided by Google called Google Nearline; if you decide you’d like to do something similar to back up your data (or if I forget how the heck I did this and need to set it up again five years down the road), I wrote down the steps I took so you or Future Michael can get started with doing the same thing.

Google Cloud Nearline

Google has published a splendid whitepaper on Nearline at https://cloud.google.com/storage-nearline/nearline-whitepaper, with more details on the pricing scheme at https://cloud.google.com/storage/pricing. If that’s TL;DR for you, the short story is that Google Nearline will store your data for $0.01 per gigabyte per month. You do need to pay extra to retrieve or delete the data, but that only costs $0.01 per gigabyte as well. Unless something goes horribly wrong you won’t need to retrieve your data anyway, and let’s be honest, if you really *do* need to retrieve that data, $0.01 per gigabyte is not that steep an extra price to pay.

Once you have read over the whitepaper and are as pumped for inexpensively backing up your data as I am, go ahead and mosey over to https://cloud.google.com/ to sign yourself up with your Google account. If you haven’t used Google Developer Tools before, you can opt to start a trial when you set up your account. Doing this will ask you for a credit card to verify you are a real person; I did this and haven’t seen any small charges come through, so you should be safe to do this no matter what your balance is.
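To put those rates in perspective, here is a minimal sketch of the arithmetic. It assumes the $0.01 figures quoted above still hold, so check the pricing page before trusting the output. The estimate_cost function and the 200 GB example are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Rough Nearline cost estimate. Both rates are assumptions taken from the
# figures quoted in this post, not from the current pricing page.
STORAGE_RATE=0.01    # dollars per GB per month
RETRIEVAL_RATE=0.01  # dollars per GB retrieved

# estimate_cost GIGABYTES MONTHS
# Prints the storage cost over MONTHS and the cost of one full restore.
estimate_cost() {
    awk -v gb="$1" -v months="$2" -v s="$STORAGE_RATE" -v r="$RETRIEVAL_RATE" 'BEGIN {
        printf "storage: $%.2f, one full restore: $%.2f\n", gb * s * months, gb * r
    }'
}

# Example: 200 GB kept for 12 months.
estimate_cost 200 12
```

Under those assumptions, a year of storing 200 GB comes to about $24, against roughly $60 for a year of Backblaze at $5 a month.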

Once you have familiarized yourself with the Google Cloud Console, install the gsutil tool on your machine. gsutil runs in the terminal and is free to use. If you do some Googling you may even be able to find a wrapper utility that runs gsutil in a graphical user interface. Google provides installation instructions for Linux, OS X and Windows at https://cloud.google.com/storage/docs/gsutil_install. Once the tool is set up, link it with your Google account by running gsutil config in the terminal (or gcloud auth login if you installed gsutil as part of the Google Cloud SDK); this will give you a link to paste into the browser, which will give you an authentication code for the tool.

Create Backups

I’d like to take a moment to give credit where it’s due for this next part. Bradley Falzon saved me a good bit of time with his article at https://bradleyf.id.au/nix/google-storage-nearline-linux-backups/; for the sake of preservation, I copied what he wrote at the bottom of his tutorial.

Create a bucket. See gsutil help mb for a complete list of options, such as specifying the bucket region.

# gsutil mb -c nearline gs://bucket_name

Now, perform your initial rsync:

# gsutil -m rsync -r /directory gs://bucket_name

The -m option runs a parallel rsync.

For future backups, use the -q option to hide all output but errors. This is useful for cron, since it will only send email if an error occurs.

# gsutil -qm rsync -r /directory gs://bucket_name
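For the cron case, one option is to wrap the quiet rsync in a small script. This is a minimal sketch and not part of Bradley's tutorial: the run_backup function, the GSUTIL, SRC and DEST variables, and the script path in the crontab comment are all hypothetical names, while /directory and gs://bucket_name are the same placeholders used above.

```shell
#!/bin/sh
# backup.sh: hypothetical wrapper around the quiet rsync shown above.
# All three variables can be overridden from the environment.
GSUTIL="${GSUTIL:-gsutil}"          # path to the gsutil binary
SRC="${SRC:-/directory}"            # local directory to back up (placeholder)
DEST="${DEST:-gs://bucket_name}"    # destination bucket (placeholder)

# -q keeps gsutil silent unless something fails, so cron only mails on errors.
run_backup() {
    "$GSUTIL" -q -m rsync -r "$SRC" "$DEST"
}

# Run the backup only when invoked as "backup.sh run", so the file can also
# be sourced without side effects.
if [ "${1:-}" = "run" ]; then
    run_backup
fi

# Example crontab entry (assumed install path), running nightly at 03:00:
# 0 3 * * * /usr/local/bin/backup.sh run
```

Keeping the paths in variables means the same script can serve more than one machine, for example a desktop and a web server, by setting SRC and DEST differently in each crontab.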

Faster CRC32 Checksums

Note that by default rsync will likely use a slow method to calculate CRC32 checksums. For a faster method it’s recommended to use the compiled crcmod module. To check whether it is already installed, run:

$ gsutil ver -l | grep crcmod

If the output shows compiled crcmod: False, then install the compiled module by following the instructions in gsutil help crc32c, which essentially use pip to install crcmod.
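That check can be folded into a small helper. This is a minimal sketch, assuming gsutil version -l prints a line of the form compiled crcmod: True or compiled crcmod: False as described above. The name has_compiled_crcmod is hypothetical, and the pip command in the comment is an assumption based on gsutil help crc32c, so defer to that help text if it differs.

```shell
#!/bin/sh
# Reads `gsutil version -l` output on stdin.
# Exits 0 if the compiled crcmod extension is reported, non-zero otherwise.
has_compiled_crcmod() {
    grep -q 'compiled crcmod: True'
}

# Typical use (requires gsutil on the PATH):
#   gsutil version -l | has_compiled_crcmod || echo "compiled crcmod missing"
# Assumed fix, which needs a C compiler and the Python headers:
#   pip install --no-cache-dir -U crcmod
```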

Restore Backups

Remember that you will be charged a retrieval fee (the $0.01 per gigabyte mentioned above) to get your data back. Restoring backups works much the same as creating them, but with the directory and bucket parameters to rsync reversed. So to restore data from a bucket to your computer, run the following:

# gsutil -qm rsync -r gs://bucket_name /directory
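Before pulling a whole bucket down, it can be worth previewing what would be copied. This is a minimal sketch using rsync's -n (dry run) option; restore_preview, restore, and the GSUTIL variable are hypothetical names, and the paths are the same placeholders as above.

```shell
#!/bin/sh
GSUTIL="${GSUTIL:-gsutil}"   # path to the gsutil binary (overridable)

# Show what a restore would copy without transferring anything (-n = dry run).
restore_preview() {
    "$GSUTIL" -m rsync -n -r "$1" "$2"
}

# Perform the actual restore from a bucket ($1) to a local directory ($2).
restore() {
    "$GSUTIL" -m rsync -r "$1" "$2"
}

# Typical use:
#   restore_preview gs://bucket_name /directory
#   restore gs://bucket_name /directory
```

The -q flag is left off here on purpose, since a restore is usually run by hand and the progress output is useful.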

The rsync command is what you’re going to be using most often to back up your data. More information on rsync is available at https://cloud.google.com/storage/docs/gsutil/commands/rsync
