Reminds me of a random Google SRE slide deck I stumbled across a couple of years ago. The switch itself was corrupting payloads, in a way that still passed the TCP checksum in use.
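For anyone wondering how corrupted data can still pass the TCP checksum: it is a 16-bit ones' complement sum, so anything that leaves the sum unchanged (e.g. two 16-bit words trading places) goes unnoticed. A rough sketch, assuming a payload-only checksum (real TCP also covers the header and a pseudo-header); `internet_checksum` and the sample bytes are made up for illustration:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = b"AABBCCDD"
corrupted = b"CCBBAADD"  # first and third 16-bit words swapped

# Addition is commutative, so the reordered payload has the same checksum.
same_checksum = internet_checksum(original) == internet_checksum(corrupted)
```

Here `same_checksum` ends up `True` even though the two payloads differ.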
This. I work at a CDN and there's a particular switch model that we have a fair number of doing this. It was incredibly hard for the network team and vendor to track down. It's because one of the internal data paths was not error checked (i.e. hw design and layout problem). We're somewhat working around it with application layer stuff, but are moving off that switch and vendor as fast as capex allows. It wasn't a bottom of the barrel vendor either.
Nowadays people are switching from TCP to unprotected UDP to avoid the costly ACK dance, and not the other way round.
He also fails to mention how trivial CRC is to reverse (e.g. http://www.woodmann.com/fravia/crctut1.htm), and how good CRC32-C actually is at detecting random bitflips. Much better than all other fast hash functions, and on par with most slow and secure hash functions.
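To make the bitflip point concrete, here is a small sketch of a bit-at-a-time CRC32-C (Castagnoli polynomial, reflected form 0x82F63B78) that flips every single bit of a message and counts how many flips go undetected. It is slow and only for illustration; the function names are my own, not from any library.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32-C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def undetected_single_bit_flips(message: bytes) -> int:
    """Flip each bit in turn and count flips that leave the CRC unchanged."""
    good = crc32c(message)
    missed = 0
    for bit in range(len(message) * 8):
        corrupted = bytearray(message)
        corrupted[bit // 8] ^= 1 << (bit % 8)
        if crc32c(bytes(corrupted)) == good:
            missed += 1
    return missed
```

A CRC catches every single-bit error by construction, so `undetected_single_bit_flips` should return 0 for any message. That says nothing about deliberate tampering, which is where the "trivial to reverse" part comes in.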
As more and more transports employ end-to-end authenticated encryption like TLS and SSH, this will become a non-issue. A message authentication code resists even malicious tampering, let alone accidental corruption.
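As a rough illustration of what a MAC buys you, here is a sketch using HMAC-SHA256 from the Python standard library. This is a stand-in for the idea only: TLS and SSH negotiate their own constructions (often AEAD ciphers), and the `tag`/`verify` helpers and the sample message are hypothetical.

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)  # shared secret, assumed known to both ends


def tag(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, received_tag: bytes, key: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(tag(message, key), received_tag)


message = b"segment payload"
sent_tag = tag(message, key)

# One flipped bit in transit is enough to make verification fail.
damaged = bytearray(message)
damaged[0] ^= 0x01
```

`verify(message, sent_tag, key)` succeeds while `verify(bytes(damaged), sent_tag, key)` fails, and unlike a CRC an attacker without the key cannot recompute a matching tag.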
The suggestion at the end to use zip compression to protect your files seems funny; when we're talking about extremely rare errors, CRC32 seems insufficient to really protect.
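For reference, the zip-level protection is just a 32-bit CRC stored per member, so a random corruption has roughly a 1 in 2^32 chance of slipping through. A minimal sketch with Python's `zipfile`, assuming an in-memory archive with an uncompressed member (the helper names and payload are made up):

```python
import io
import zipfile


def make_archive(payload: bytes) -> bytes:
    """Build an in-memory zip with one stored (uncompressed) member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("data.bin", payload)
    return buf.getvalue()


def first_bad_member(archive: bytes):
    """Return the name of the first member failing its CRC check, else None."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.testzip()


payload = b"important bytes " * 4
archive = make_archive(payload)

# Flip one bit inside the stored payload, leaving the zip headers intact.
corrupted = bytearray(archive)
corrupted[archive.index(payload)] ^= 0x01
```

`first_bad_member(archive)` returns `None`, and `first_bad_member(bytes(corrupted))` reports `data.bin`. A single flipped bit is always caught; it is the rare multi-bit garbage that happens to match the CRC that a 32-bit check cannot rule out.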
Interestingly, IPv6 removed the checksum from its packet header [1], delegating this work to higher-level protocols such as TCP and UDP [2]. So I guess that raises the chance of an invalid IP header getting through. A corrupted source or destination address will probably result in a dropped packet, but I wonder what happens if one of the other fields is corrupted, say traffic class.
The "workaround" has nothing to do with Ethernet. Because Ethernet isn't reliable enough at detecting errors, higher-level protocols now use better error detection mechanisms of their own.
The error estimate that "between 1 in 16 million and 1 in 10 billion TCP segments will have corrupt data and a correct TCP checksum" is from "Performance of Checksums and CRCs over Real Data" [Stone and Partridge] which only analyzed a particular type of framing error over ATM. More modern transports should be immune to this form of error because the line encodings used make it nearly impossible to start a packet in the wrong place or include a fragment of the following packet.
http://www.catonmat.net/blog/wp-content/uploads/2008/11/that...