A list of hashes (tuple of [hashed url+date metadata, hashed content]) takes much less disk space than the archive contents themselves. Archive websites could publish the list for all their content so it can be compared against in the future. People would save copies of the list. If you didn't store the list yourself ahead of time, and don't trust a third-party to be "the source of truth", the archive could've uploaded the hashes to the blockchain at archive time:
https://gwern.net/timestamping