
Treat DigitalOcean as any other provider - something you can't trust, so always have backups of your own data in a place you trust (not a DigitalOcean snapshot) so you can restore when needed.

Personally I use Ansible http://www.ansibleworks.com/ and rdiff-backup http://rdiff-backup.nongnu.org/, along with Vagrant http://www.vagrantup.com/ for testing. So the day something happens with my droplet on DigitalOcean - I'll just run Ansible on a fresh server and restore the remaining data with rdiff-backup.
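The recovery workflow above might look something like this (hostnames, playbook name, and paths are all hypothetical examples, not the poster's actual setup):

```shell
# 1. Provision a fresh droplet, then configure it from scratch with Ansible
ansible-playbook -i 'new-droplet.example.com,' site.yml

# 2. Push the most recent state from the local rdiff-backup archive
#    back onto the new server ("now" = latest increment)
rdiff-backup --restore-as-of now /backups/droplet/var-www \
    root@new-droplet.example.com::/var/www
```

The point of the split is that Ansible rebuilds everything reproducible (packages, configs, users) and rdiff-backup only has to carry the data that can't be regenerated.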

Yes, the lack of IPv6, the inability to use the virtual machine's bootloader, the lack of a decent rescue image, and private networking in only one location all suck. However, for the price it's a good deal.



Please accept our invitation to use rsync.net as the "backups of your own data in a place you trust".

The "HN Discount" is still granted to new customers that know to ask about it - and we have always supported rdiff-backup, which you mentioned.

We support IPv6 in our Denver and Zurich locations...


Definitely a happy customer! +1 to rsync.net. I've only ever had two issues there, both of which were resolved quickly and painlessly.

The first was when my bank account suddenly got drained by a subscription plan to rsync.net. This later turned out to be an issue with the PayPal subscription; rsync.net refunded everything within an hour or so of my reporting the issue, despite being in quite a different timezone. I was surprised someone was awake!

The second was that I was on a very old contract (going by emails, I've been using them on this account since Sep 2010, and via NearlyFreeSpeech since March 2009; wow, time flies!) and my storage space vs. price was a bit off compared to their most recent pricing/HN offering. A quick ticket, and within a couple of days everything was updated and I was on a much better pricing plan.

More importantly than all of that, I've never lost data there. The only times I've been lacking a backup were when the backup was never pushed out to their servers in the first place, due to my offsite script failing to run.

As you can probably tell I'm a little bit of an rsync.net fanboy.. ahem :-)


Yep - I'm also using Ansible and Vagrant, in conjunction with machines on my home network, reverse ssh tunnels, and API manageable DNS. I'm working towards having services able to be dynamically/automatically moved between various VPS providers (Digital Ocean/AWS/Hetzner/NineFold) without me worrying (or even being aware) which VPS provider is currently being used.


How would you do it in such a way that you are not even aware?


The plan is a mail server where the hardware/storage is in my home, opening reverse ssh tunnels for ports 25 and 465 to inexpensive VPSes regularly created and destroyed via APIs, with DNS MX records updated automatically. That way my world-visible MX endpoint will regularly change IP address, and move between US, European, and Australian based datacenters. The VPSes are configured to not store or log anything, and to always attempt to initiate an SSL/TLS connection (via the SMTP STARTTLS command).

I'll know which VPS providers I have accounts with (and anybody curious could also find out by watching my zonefile updates), but at any time I won't care where the remote end of the ssh tunnels is or where the MX records are currently pointing.
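The tunnel side of a setup like this could be sketched as follows (the hostname is hypothetical; note that binding privileged remote ports such as 25 and 465 requires connecting as root, or a similarly capable user, and `GatewayPorts yes` in the VPS's sshd_config):

```shell
# On the home mail server: forward the VPS's public SMTP ports
# back through the tunnel to the machine at home.
# -N: no remote command, tunnel only
# -R: listen on the VPS, deliver connections to localhost here
ssh -N \
    -R 0.0.0.0:25:localhost:25 \
    -R 0.0.0.0:465:localhost:465 \
    root@cheap-vps.example.com
```

In practice this would be wrapped in something like autossh or a systemd unit so the tunnel is re-established when a VPS is destroyed and a new one is provisioned.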


OK, so you are using these systems as data relays, not to store data. This makes sense. I think it would be much harder to do this if you wanted to switch between VPSes that were storing your data.


Sure.

I've done some thinking (but no experimenting yet) with EncFS combined with S3FS to store encrypted, mountable data on Amazon S3 (I'm currently using EncFS to store data on Dropbox & GDrive and with BTSync). No good if you need fast local access to the data (you wouldn't want to run your database this way), but it would solve _some_ of those problems. For me, right now, the answer is to store my own data at home and relay access to that data when needed.
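The EncFS-over-S3FS layering works by mounting the bucket first and then pointing EncFS at a directory inside it, so only ciphertext ever reaches Amazon. A minimal sketch, assuming a hypothetical bucket name and mount points:

```shell
# 1. Mount the S3 bucket as a plain filesystem (ciphertext lives here)
s3fs my-backup-bucket /mnt/s3raw -o use_cache=/tmp

# 2. Layer EncFS on top: /mnt/s3raw/encrypted holds encrypted blobs,
#    /mnt/s3clear presents the decrypted view locally
encfs /mnt/s3raw/encrypted /mnt/s3clear
```

Every read and write on the cleartext mount turns into S3 API calls underneath, which is why latency-sensitive workloads like databases are a poor fit.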


Would it be possible for you to describe your rdiff-backup process a bit more? Do you dump the database periodically and back up that data, or do you run rdiff-backup on the binary data directory of your DB? Also, where is the rdiff-backup data stored? I am guessing you would need another server for this, and it can't just be Amazon S3? Thanks!


On a local server at home; it's a low-powered ARM box (QNAP TS-219 running Debian) with RAID1. The same thing could probably be done with a Raspberry Pi for less if I were doing it now.

For database backups, you need a consistent snapshot of the data - something which might not happen if you attempt to access the data directly. Add a script to your daily crontab directory, such as:

mysqldump --skip-extended-insert --all-databases --single-transaction --master-data=2 --flush-logs | gzip -9 --rsyncable > backup.sql.gz

Or:

sudo -u postgres pg_dumpall | gzip -9 --rsyncable > backup.sql.gz

Into a directory which rdiff-backup will download.

I always prefer to pull my backups from a local server I can trust, rather than running a process on the remote server to push backups: if someone gains access to the server and all the backup credentials are stored on it, they could potentially destroy the backups as well.
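A pull-based run from the trusted home box might look like this (hostname and paths are hypothetical examples; the home box holds the SSH key, the droplet holds nothing):

```shell
# Pulled nightly by cron on the home server; rdiff-backup keeps a
# current mirror plus reverse increments for point-in-time restores
rdiff-backup root@droplet.example.com::/var/backups /srv/backups/droplet

# Prune increments older than 60 days to bound disk usage
rdiff-backup --remove-older-than 60D /srv/backups/droplet
```

Because the droplet never authenticates to the backup store, an attacker who compromises it can corrupt future dumps but cannot reach back and delete the existing history.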


That's very helpful. Thank you so much.



