An Answer to the Backup Question?

I’ve been posting a lot about backup. Well, not a lot a lot, but roughly once a year for the past half decade, I have solicited the wisdom of the internet brain trust while searching for a backup solution.

My backup needs are a bit atypical; they are not unique, or this post would be pointless, but they are not the same as those of your everyday household. I have a rather large amount of data and a rather large number of computers on different platforms that I want to back up.

I think I have around 5-6 TiB of data (a lot of movies converted from DVD, photos, music, …). I want that backed up because I’m way too lazy to rip it again. I have a couple of regular computers, some with Windows and most with OS X. I want those backed up. Then I have around 20 Linux machines. The number varies. Most of them are live servers, some are phased-out servers I keep alive because I’m too lazy to decide whether they contain anything valuable, and some are templates for setting up new machines. The Linux machines all run on an ESXi server.

This means I cannot conveniently use the standard solutions: hosting ~10 TiB of data is pretty expensive with most per-amount solutions, and most per-host solutions are annoying, expensive, or both if you have over 20 machines you want to back up.

In the good old days, I used CrashPlan. They had an affordable family plan with unlimited storage for up to 10 machines. That was pretty ok; at the time I only had 10 important machines.

Then CrashPlan hiked their prices, and I didn’t renew my subscription. Instead I went with a combination of Amazon Cloud Drive (ACD) and CloudBacko. At the time, ACD provided unlimited storage for $60/year, and CloudBacko was a one-time investment for a license. CloudBacko is licensed per host, but with a bit of scripting, I could set up a dedicated backup host and back up individual hosts over ssh. CloudBacko is also able to back up ESXi hosts directly.

Unfortunately, this solution has also become nonviable.
CloudBacko has decided that “free forever” means free until 2017, and they are now charging for upgrades (“you can keep using the old version forever” – blah, that will last for 2 weeks until somebody changes an API). Furthermore, ACD has turned real shitty real fast. It has increased the price to $60/TiB/year and has stopped issuing any new 3rd party API keys. That means that no new 3rd party backup applications will be added, and my yearly subscription fee went up by a factor of 8-10. Simultaneously, CrashPlan went tits-up (they are now “focusing on their enterprise customers”).

This all adds up to me having to find a new backup solution. There are two big cloud backup solutions remaining: Carbonite and BackBlaze. Both work on Windows and Mac; BackBlaze is $50/year per computer and Carbonite starts at $60. There’s also Mozy, but they offer a laughable 50 GB plan, so there’s no need to consider them.

It turns out that BackBlaze additionally has a new cloud storage service. It’s $60/TiB/year. That’s the same as the “new” price for ACD, but they didn’t do a bait-and-switch on me, so I don’t hate them already. BackBlaze also has a working 3rd party API and actually encourages usage of their service, so that’s a plus.

I looked at the third party backup solutions on BackBlaze’s list (they don’t support Linux natively), and eventually narrowed it down to Duplicacy, Duplicity and Restic. This handy overview compares them (along with another one that does local backup only). Restic takes up way too much space, and Duplicity is not good at incremental backups (you need the entire chain of incremental backups to restore from them). Surprisingly, the comparison is made by the author of Duplicacy (fuck that name!).

Duplicacy is a commercial product. The first computer costs $20 for the first year and $5/year after that; any extra computer costs $10 for the first year and $2/year after that (or $50 / $10 for commercial use).
It has a command line client which is free for personal use (or $20/year for unlimited commercial use). It supports backing up to around 10 cloud providers, including BackBlaze. That means that if BackBlaze becomes shitty, I can switch to another provider. I can even use my extra Dropbox storage (I have a 1 TiB plan there) for regular backups, as they don’t charge for API usage. Duplicacy deduplicates per file fragment, meaning that files shared among my Linux servers are stored only once, which is pretty neat.

Not only that, there is a clone of Duplicacy which runs directly on ESXi and can back up virtual machines directly. It is called Vertical Backup and shares its file format and command line syntax with Duplicacy. It supports thin provisioning (a thing not supported by CloudBacko). Vertical Backup is just $10/year for personal use and 10 times that for commercial use.

Finally, I noticed that my NASes (both Synology) support BackBlaze cloud storage semi-natively. I can sync to their cloud, and the cloud provides versioning and custom retention, which is close enough to backup for me.

At the end of the day, that leaves me using Duplicacy on all servers, Vertical Backup on ESXi and BackBlaze on my desktop. I back up my servers both inside the VM and as whole VMs. The whole VMs go to BackBlaze (though Amazon Glacier is also an option) and the individual server files go to Dropbox.
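
For the curious, here is a rough idea of why fragment-level deduplication saves space when many servers share the same files. This is only a toy sketch and not Duplicacy's actual algorithm or storage format: the `ChunkStore` class, the fixed 4-byte chunk size and the use of SHA-256 are all assumptions I made up for illustration.

```python
import hashlib

# Hypothetical, tiny chunk size so the effect is visible on short inputs.
CHUNK_SIZE = 4


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def backup(self, data):
        """Split data into fixed-size chunks, store new ones, return the digest list."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks not seen before take up new space.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def restore(self, manifest):
        """Rebuild the original bytes from a list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in manifest)

    def stored_bytes(self):
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
server_a = store.backup(b"AAAABBBBCCCC")  # 12 bytes
server_b = store.backup(b"AAAABBBBDDDD")  # 12 bytes, 8 of them shared with server_a
print(store.stored_bytes(), "bytes stored for 24 bytes backed up")
```

Two "servers" with mostly identical content end up sharing most of their chunks, and each one can still be restored in full from its own list of digests.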

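To put numbers on the pricing complaints above, here is a back-of-the-envelope calculation. The prices and sizes are the ones quoted in this post; the function names are just made up for the sketch, and the per-host figure is hypothetical for my setup, since BackBlaze's per-computer plan only covers Windows and Mac.

```python
# Figures quoted in the post.
ACD_OLD_FLAT = 60         # $/year, old unlimited Amazon Cloud Drive plan
PER_TIB_RATE = 60         # $/TiB/year, new ACD price and BackBlaze cloud storage
BACKBLAZE_PER_HOST = 50   # $/year per computer, BackBlaze desktop backup


def per_tib_cost(data_tib, rate=PER_TIB_RATE):
    """Yearly cost when paying per TiB stored."""
    return data_tib * rate


def per_host_cost(machines, rate=BACKBLAZE_PER_HOST):
    """Yearly cost when paying per backed-up computer."""
    return machines * rate


def acd_increase_factor(data_tib):
    """How many times more ACD costs per year after the switch to per-TiB pricing."""
    return per_tib_cost(data_tib) / ACD_OLD_FLAT


for tib in (8, 10):
    print(f"{tib} TiB: ${per_tib_cost(tib)}/year, "
          f"{acd_increase_factor(tib):.0f}x the old ACD price")

print(f"20 machines at per-host pricing: ${per_host_cost(20)}/year")
```

With 8-10 TiB stored, the per-TiB price works out to the factor of 8-10 mentioned above, and per-host pricing for 20 machines would cost even more than that.
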