Downvote wasn't me. I agree with the sentiment regarding requirements. However, ...

contingencies · on Sept 18, 2013

I disagree with the bit regarding data redundancy and integrity.

You are welcome to disagree but I'd like to see some reasoning.

You can do it other ways, but that doesn't make it a good idea; it's a bit like Greenspun's Tenth Rule, but for data.

Had to go searching for that rule, which seems to be Lisp-snobbery which is clearly somewhat justified in theory but almost irrelevant in practice. Right tool for the job, and all that. It's such a broken metaphor for storage consistency or availability that I'm not going to comment further.

ZFS, or something like it (and there isn't anything else like it), is the foundation of any modern setup

Do you honestly view ZFS as the be-all and end-all of data storage? That would be ... sad. Other filesystems can offer snapshots and high availability, as can other elements within a storage system. For example, in Linux, DRBD is a block device driver that provides even more powerful availability guarantees than any conventional (~single-host-homed) filesystem. Likewise, LVM2 has provided block-layer snapshots for ages. Similarly, Linux is unsurprisingly the most vibrant platform for cluster filesystems. Then there are also other great general-purpose tools such as RAID, signatures/checksums, and such.

If your data is important, then you'll need to look elsewhere than Linux for the servers where the data sleeps.

That's just ridiculous. I guess you're going to tell me most of the world's data lives on ZFS? Google uses ZFS? Facebook uses ZFS? Yahoo uses ZFS? Let's be realistic here: you're absolutely and demonstrably wrong, and have provided no compelling argument.

dmpk2k · on Sept 18, 2013

which seems to be Lisp-snobbery which is clearly somewhat justified in theory but almost irrelevant in practice

I agree, but you're missing the forest for the trees here. Please accept my arguments in good faith.

Do you honestly view ZFS as the be-all and end-all of data storage?

For local storage? Right now? Yes, it's the best we have.

provided no compelling argument

How many filesystems have Merkle trees? You need something like them to avoid phantom reads, phantom writes, and silent corruption.

How many filesystems keep duplicate metadata blocks, keep several copies of [what's analogous to] the superblock, and can duplicate data a user-specified number of times? And then use the Merkle tree property above to check the validity of those copies on every read?
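To make the Merkle tree point concrete, here is a toy sketch in Python. It is not how ZFS is implemented; the names `merkle_root` and `verified_read` are hypothetical, and it assumes the root hash is stored somewhere trusted. It only shows the idea: hash the blocks up to a single root, then accept a redundant copy on read only if its root matches.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash each block, then hash pairs upward until one root remains."""
    level = [h(b) for b in blocks]
    if not level:
        return h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def verified_read(copies: list[list[bytes]], trusted_root: bytes) -> list[bytes]:
    """Return the first redundant copy whose Merkle root matches (hypothetical helper)."""
    for blocks in copies:
        if merkle_root(blocks) == trusted_root:
            return blocks
    raise IOError("all copies failed checksum verification")


# Usage: one copy has a silently flipped digit; the read falls back to the good one.
good = [b"balance=100", b"balance=250"]
bad = [b"balance=100", b"balance=950"]
root = merkle_root(good)
assert verified_read([bad, good], root) == good
```

A plain per-block checksum stored next to the block would catch a flipped bit too, but it could not catch a stale or misdirected block that is internally consistent. Keeping the hash in the parent is what covers those cases.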

How many filesystems offer free and instant snapshots? As many as you want? Those things are wonderful for databases.

How many filesystems offer software RAID? Hardware RAID is a dodgy idea, because it's a complex binary blob in firmware you have no insight into when something goes wrong (speaking from bitter experience, things go wrong). Furthermore, some hardware RAID implementations suffer from a write hole.

How many filesystems are transactional? And allow you to roll back if a transaction becomes unfixably corrupted? How many can replicate? How many use SSDs efficiently? How many have been in heavy industrial use for years?

ZFS has all that (not some of it, that's the point), and more. There's nothing else like it. btrfs probably will be one day as well, but not yet.

So, no, it's not ridiculous. I've been down this trail of tears before, and ZFS has made life so much better. At least I don't need to dread a number in my database silently flipping a digit anymore -- if that scenario doesn't give you the hives, then I really don't know what to say.

contingencies · on Sept 18, 2013

Many things can corrupt your data ... outside of the filesystem. You seem unswervingly fixated on ZFS for some reason, and that fixation is simply wrong. If anyone here is missing the forest for the trees, it's you.

dmpk2k · on Sept 19, 2013

I make an argument in good faith and get a nonsensical passive-aggressive blow-off in return. You should be ashamed.

contingencies · on Sept 19, 2013

I fully recognize ZFS's great feature set. It's just a tool, though, and represents only one potential solution, appropriate for certain requirements, within one layer of a storage subsystem. If paranoid levels of data integrity are an end-to-end requirement, ZFS isn't a magic bullet.
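As a rough illustration of what "end-to-end" could mean here, the sketch below has the application attach its own digest to a record before the record leaves the process, and check it again after reading it back. The `seal` and `unseal` names and the digest-plus-newline layout are assumptions made up for this example, not any particular product's format. The check works the same no matter which filesystem, network, or block layer sits in between.

```python
import hashlib
import json


def seal(record: dict) -> bytes:
    """Serialize a record and prefix it with a SHA-256 hex digest of the payload."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    return digest + b"\n" + payload


def unseal(blob: bytes) -> dict:
    """Verify the digest computed by the writer, then decode the record."""
    digest, _, payload = blob.partition(b"\n")
    if hashlib.sha256(payload).hexdigest().encode("ascii") != digest:
        raise ValueError("record failed end-to-end checksum")
    return json.loads(payload)


# Usage: a record that round-trips intact is accepted.
blob = seal({"id": 7, "amount": 1234})
assert unseal(blob) == {"id": 7, "amount": 1234}
```

This only detects corruption; recovering from it still needs a redundant copy somewhere, which is where the storage layer comes back in.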