I've got files going back to 1991. They started on floppy and moved to various formats like hard drives, QIC-80 tape, PD optical media, CD-R, DVD-R, and now back to hard drives.
I don't depend on any media format, tape included, lasting forever. New LTO tape drives are so expensive, and used drives only support smaller-capacity tapes, so I stick with hard drives.
3-2-1 backup strategy: 3 copies, with 1 offsite.
Verify all the files by checksum twice a year.
You can overcomplicate it if you want, but once you script things it's just a couple of commands once a week.
I have some going back to my first days with computers (~1997), but it's purely luck. I've certainly lost more files since then than I've kept.
Does that tear me up? Not one bit. And I guess that's the reason why people aren't clamouring for archival storage. We can deal with loss. It's a normal part of life.
It's nice when we do have old pictures etc., but maybe they're only nice because they're rare. If you could readily drop into archives and look at poorly lit pictures of people doing mundane things 50 years ago, how often would you do it?
I'm reminded of something one of my school teachers recognised 20+ years ago: you'd watch your favourite film every time it was on TV, but once you get it on DVD you never watch it again.
I think in general we find it very difficult to value things without scarcity. But maybe we just have to think about things differently. Food is no longer valued for its scarcity. Instead I consider each meal valuable because I enjoy it, but can only afford to eat two meals a day if I want to remain in shape. I struggle to think of an analogy for post-scarcity data, though.
What is your process for automating this checksum twice a year? Does it give you a text file dump with the absolute paths of all files that fail checksum for inspection? How often does this failure happen for you?
All my drives are Linux ext4 and I just run this program on every file in a for loop. It calculates a checksum and stores it along with a timestamp as extended attribute metadata. Run it again and it compares the values and reports if something changed.
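Roughly, the idea looks like this (a simplified sketch, not the exact tool: it assumes Linux with user xattrs enabled, uses SHA-256 plus the file's mtime, and the attribute names are illustrative). Keeping the mtime alongside the checksum is what lets it tell bitrot apart from a file I actually edited.

    #!/usr/bin/env python3
    # Simplified sketch of checksum-in-xattrs verification (not the exact tool).
    # Assumes Linux with user xattr support; attribute names are illustrative.
    import hashlib
    import os
    import sys

    SUM_ATTR = "user.sha256"         # stored checksum (illustrative name)
    TIME_ATTR = "user.sha256.mtime"  # file mtime when the checksum was taken

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def verify(path):
        if not os.path.isfile(path):
            return
        current_sum = sha256(path)
        current_mtime = str(os.stat(path).st_mtime)
        try:
            stored_sum = os.getxattr(path, SUM_ATTR).decode()
            stored_mtime = os.getxattr(path, TIME_ATTR).decode()
        except OSError:
            # First run: record checksum and timestamp as extended attributes.
            os.setxattr(path, SUM_ATTR, current_sum.encode())
            os.setxattr(path, TIME_ATTR, current_mtime.encode())
            return
        if current_sum != stored_sum and current_mtime == stored_mtime:
            # Content changed but mtime did not: likely silent corruption.
            print(f"FAILED: {path}", file=sys.stderr)
        elif current_mtime != stored_mtime:
            # File was legitimately modified; refresh the stored values.
            os.setxattr(path, SUM_ATTR, current_sum.encode())
            os.setxattr(path, TIME_ATTR, current_mtime.encode())

    if __name__ == "__main__":
        for root, _, files in os.walk(sys.argv[1]):
            for name in files:
                verify(os.path.join(root, name))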
These days I would suggest people start with ZFS or Btrfs, which have checksums and scrubbing built in.
Over 400TB of data, I get a single failed checksum about every 2 years. So I get a file name and the fact that it failed, but since I have 3 copies of every file, I check the other 2 copies and overwrite the bad one. This is after verifying that the hard drive's SMART data shows no errors.
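The repair step is nothing fancy either; conceptually it's just a majority vote across the three copies, something like this hypothetical sketch (the paths are made up):

    #!/usr/bin/env python3
    # Hypothetical repair sketch: majority-vote across three copies and
    # overwrite the copy whose checksum disagrees. Paths are made up.
    import hashlib
    import shutil

    COPIES = [
        "/mnt/primary/photos/img_0001.jpg",   # illustrative paths only
        "/mnt/mirror/photos/img_0001.jpg",
        "/mnt/offsite/photos/img_0001.jpg",
    ]

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    sums = {path: sha256(path) for path in COPIES}
    # The checksum that at least two of the three copies agree on.
    majority = max(set(sums.values()), key=list(sums.values()).count)
    good = next(p for p, s in sums.items() if s == majority)

    for path, digest in sums.items():
        if digest != majority:
            print(f"restoring {path} from {good}")
            shutil.copy2(good, path)  # overwrite the bad copy with a good one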
> What is your process for automating this checksum twice a year?
Backup programs usually do that as a standard feature. Borg, for example, can do a simple checksum verification (for protection against bitrot) or a full repository verification (for protection against malicious modification).