As someone with zero knowledge regarding Zero Knowledge Proofs in a programming context, can someone give me a basic explanation regarding the utility? I do understand the basic principle of ZKPs, but as yet I’m failing to understand how this would be applied in industry.
For me, the most powerful use of ZKPs is proof of the output of general purpose computations of any kind.
You can run an arbitrarily large, arbitrarily long program, and whatever the program outputs, you can make a tiny proof-signature that says "this is the output you'll get if you run this program yourself".
The proof-signatures are relatively small, and you can verify them on small devices in milliseconds.
Another computer can trust the claimed output without having to run the program itself, by verifying the proof-signature.
This scales to arbitrarily large computations, so for example if a supercomputer says "I ran a quadrillion petaflops of your program for 1 year, and the result was the picture attached to this signature", you actually can verify that the picture is correct, quickly and efficiently - without having to trust the supplier.
It's as good as if you re-ran the program yourself (up to cryptography-grade probabilities, which is good enough).
Or if the big computer says "this entire Debian distribution of binary files was indeed compiled with this version of GCC", you can quickly verify that all the binaries are exactly what they should be - without having to trust anyone.
The proof process is rather slow, but it has gotten a lot faster over the last few years, and will continue to.
I was amazed when I learned that it's possible to securely check an arbitrarily large computation's output or result without running it yourself.
It was so counter to my intuition: it seemed like you would have to trust whoever makes the claim, or run it yourself. But you don't!
(So amazed and intrigued that I had to learn how it's done, and now part of my work these days is optimising the proof process.)
> Or if the big computer says "this entire Debian distribution of binary files was indeed compiled with this version of GCC", you can quickly verify that all the binaries are exactly what they should be - without having to trust anyone.
> So amazed and intrigued that I had to learn how it's done
Any chance you could just illustrate this somehow with a basic example? I just don't see how you could possibly verify that a program is produced with GCC without going through approximately as much effort as it'd take to compile it.
As far as I understand, you can't just use any gcc binary as it exists today. The program needs to be represented as a specific, mathematical expression that is suitable for zero-knowledge proofs.
There are RISC-V based zero-knowledge virtual machines, to which a GCC version can be compiled. So this should be possible, although probably very slow, maybe a thousand times slower than a normal GCC execution.
That was 500 kHz, and the latest versions are at 1 MHz, so still 1000 times slower than a 1 GHz machine, but it’s easy to parallelize the workload, so it’s really mostly a cost of computation issue.
Is anyone actually using this to cache the artifacts of a compiler? Do you have a link? Like a proof of concept compiler that can produce both a binary and a proof that it was compiled correctly.
A toy example: suppose we have some sudoku. You want to show publicly (maybe in a HN comment) that you know the solution, without revealing the solution itself, because then anybody would know it and be able to post that they know it. A zero-knowledge proof enables this. You could also post a hash of the solution, but then you need to know the solution already to verify a submission. (It would also enable others to copy your answers without really knowing the solution, though that can be fixed using a technique that zero-knowledge proofs also use, a blinding factor).
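To make the "hash plus blinding factor" part concrete, here is a minimal sketch of a salted hash commitment. To be clear, this is only a commitment, not a zero-knowledge proof: it lets you post something now and reveal it later, but nobody can check that the hidden solution is valid until you reveal it. The function names and the sample string are made up for illustration.

```python
import hashlib
import secrets

def commit(solution: str) -> tuple[str, bytes]:
    """Return (commitment, blinding factor) for a solution string."""
    blinding = secrets.token_bytes(32)  # random blinding factor, kept secret
    digest = hashlib.sha256(blinding + solution.encode()).hexdigest()
    return digest, blinding

def opens_to(commitment: str, solution: str, blinding: bytes) -> bool:
    """Check a later reveal against the commitment that was posted earlier."""
    return hashlib.sha256(blinding + solution.encode()).hexdigest() == commitment

# Hypothetical usage: post `posted` publicly, keep `solution` and `blinding` private.
solution = "534678912"  # stand-in for a full solved grid
posted, blinding = commit(solution)
```

Because the blinding factor is random, two people committing to the same solution post different hashes, so copying someone's hash doesn't let you open it later.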
More useful cases include decoupling payment information from users, to preserve their privacy. You can prove that somebody paid for the action you want to perform, without identifying the payer. For example to offer cloud storage without knowing which data belongs to which user, so when there is a data breach or law enforcement order, the answer to "tell me everything you know about user X" is their payment history, but not which data is theirs.
One place I wish there were zero-knowledge proofs involved, or even any kind of cryptography, is when you perform credit assessment for loans outside your bank: an external loan provider peeks at your full bank account history to assess whether you’re eligible. They don’t need to know where I buy my socks, or even how much money I have. Only that I have a big enough deposit and a steady enough cashflow.
This isn't a cryptographic problem really. The loan checker is already trusting your bank to give them the correct information, so it's only a matter of anonymization (e.g., they could return merchant types instead of merchant names), but there's no real incentive for this.
Where you spend can have an impact on a decision… e.g. you may have the income and savings but if you’re regularly spending on gambling that can be a red flag.
If the loan assessment criteria are objective, they can be quantified.
The basic concept here is: ZKP lets you prove arbitrary statements.
Instead of:
Here is entire bank history, you decide.
You can say:
Had a fixed income above $X for 12 months.
Had a surplus of $X after fixed expenses in the last 3 months.
Did not buy anything irregular above $1000 in the last 3 months.
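Those three statements can be written as one ordinary predicate over the transaction list. A rough sketch, assuming a made-up `Tx` record and thresholds in cents; in a real setup this function would be the program compiled for a zkVM or circuit, with `history` as the private input and only the thresholds and the final True/False made public. None of the proving machinery is shown here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    month: int    # 0 = most recent month, 11 = twelve months ago
    amount: int   # cents; positive = income, negative = spending
    fixed: bool   # True for salary and recurring bills

def eligible(history: list[Tx], min_income: int, min_surplus: int, big_purchase: int) -> bool:
    """The public statement; `history` is what would stay private."""
    recent = [t for t in history if t.month < 3]
    # Fixed income above the threshold in each of the last 12 months.
    income_ok = all(
        sum(t.amount for t in history if t.month == m and t.fixed and t.amount > 0) > min_income
        for m in range(12)
    )
    # Surplus after fixed expenses over the last 3 months.
    surplus = sum(t.amount for t in recent if t.fixed)
    # No irregular purchase above the limit in the last 3 months.
    no_big = not any((not t.fixed) and -t.amount > big_purchase for t in recent)
    return income_ok and surplus > min_surplus and no_big
```

The lender learns the output of `eligible` and nothing about the individual transactions, provided the input is tied to something the bank signed.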
"Did not gamble" is a moral judgement. Who knows, maybe I'm buying gum at the local casino, is that gambling? Maybe I'm tossing a coin every night after work as to whether I should drive in the opposite lane, is that not gambling? You can only objectively measure financially risky behavior in statistical terms.
Think of a ZK proof as a program that can take both public and private knowledge as input, and produce public and private knowledge as output.
This is what seems magical to me: A program with secret input. You can't run the program to verify that my execution of the program is correct, but you can verify a proof that I ran the program with input you didn't have.
The way private knowledge works is through cryptographic commitments.
For example, the bank may start by giving you a signed, structured document with your transactions.
You can then feed their signature and the document to your program, and produce any derivation.
You can use the MCCs (merchant category codes) of each tx, then generate a zk-proof that shows none of the MCCs in your account match restricted categories.
This requires cooperation from the "bank", ideally providing Merkle trees to make sure no tx is missing in the proof like it would be for a blockchain-based solution.
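For anyone who hasn't seen the Merkle tree part, here is a minimal sketch, assuming SHA-256 and the simple "duplicate the last node on odd levels" convention (real systems differ in the details). The idea is that the bank signs only the root; recomputing the root from every leaf shows nothing was dropped, and a short sibling path shows a single tx belongs to the signed set. The function names are hypothetical and the list of leaves is assumed to be non-empty.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves: list[bytes], index: int) -> tuple[bytes, list[tuple[bytes, bool]]]:
    """Return the root plus the sibling path for leaves[index].

    Each path entry is (sibling_hash, sibling_is_on_the_right).
    """
    level = [h(b"\x00" + leaf) for leaf in leaves]  # prefix separates leaves from inner nodes
    path = []
    while len(level) > 1:
        if len(level) % 2:                # odd count: duplicate the last node
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], sibling > index))
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path

def verify_inclusion(root: bytes, leaf: bytes, path: list[tuple[bytes, bool]]) -> bool:
    """Recompute the root from one leaf and its sibling path."""
    node = h(b"\x00" + leaf)
    for sibling, on_right in path:
        node = h(b"\x01" + node + sibling) if on_right else h(b"\x01" + sibling + node)
    return node == root
```

In the ZK setting the same hashing would run inside the proof, so the verifier sees only the signed root and the claim, not the leaves.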
This is the example often cited as a use case for zkTLS protocols, which are protocols that use additional trust assumptions to notarize a connection you have with a TLS-protected host; you can then prove the notarization (i.e. a signature over the TLS transcript) in ZK, as well as any parsing over request/response data.
So if we imagine a very rudimentary social hierarchy with a government on top, then thousands of corporations below, and then millions of people below corporations, this feature protects people in a case when government is malicious, but every single corporation is benevolent. Now if the government is not malicious, but corporations are, even part of them, it will allow them to basically take any payment and refuse service or do any other variants of abuse, costing time or money (think how it is bad today, and make it worse). And there is nothing to be done with it, because payment chain information is broken. Which is very useful for criminals who would want to run some business unaccountable and outside of the law system, and not very "useful" for the regular people.
Can this also be used in session replay software? For example, if someone from another team is trying to debug an app issue while watching a replay of the issue captured via the DOM, but is stuck because some PII is not visible, can we implement this from the user's end? Something like an OTP to access the PII, but only with the user's consent?
I come from a traditional finance background. One underappreciated possible role for ZKP is in compliance.
Eg Goldman Sachs could encode all their compliance rules in a program, and publish a proof that their books pass the check by that program, without revealing anything about their accounting.
More crypto focussed: suppose you build a 'better FTX'. You could publish a proof that you ain't hiding an Alameda, ie that everyone who should have been liquidated actually got liquidated, and doesn't get special treatment.
In a banking context, you could in theory also run your know-your-customer (KYC) rules against customer provided data, store the proof, and delete the original data. That way, you still have proof that your customers don't have ties to North Korea or Russia, but you can't be compelled by anyone to reveal the data later (nor accidentally leak that data, etc).
Of course, for that latter application, you need a sharp lawyer to make sure that storing the proof instead of the original data is enough for your KYC obligations.
If you want to go further, you could have your customers run the KYC rules locally, so that their data never leaves their premises.
(For all these applications, you still have to have a mechanism that connects the real world to the inputs of the programs whose execution you are proving.
So eg Goldman Sachs would still need an auditor that checks that the assets and obligations they have in their balance sheet actually exist, but the auditor does not otherwise need to make judgement calls or apply any rules.)
I am building in the mortgage origination space and have these sorts of enhancements on the drawing board... you hit the nail on the head that the bottleneck will be in QC and legal review, as 3rd-parties (especially regulators) may want to manually see the data you used to reach the conclusions you did. Although I'm still digging. You'd be interested in what we're creating btw, it's a crypto-based mechanism that enables instant digital mortgage origination, we should chat!
The way I always explain it to newcomers now is to start from digital signatures.
Digital signatures are useful, we all know that, now imagine if you could sign not only data, but also computation result. As in “I ran this code with these inputs and it produced that output”.
If you imagine that this would work, and it takes less time to verify that signature than running the program myself, you have a succinct proof.
If in addition you can hide some of the inputs you used, then you have a zero knowledge proof.
So ZKPs are “stronger” signatures as they can sign more than data. Sometimes a signature is enough, sometimes you need more. Sometimes you need privacy so you verify a signature inside a ZKP :D
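The signature-to-ZKP bridge is easiest to see in a Schnorr proof, which is both a signature scheme and a zero-knowledge proof of knowing a secret key. A minimal sketch follows, assuming a toy group (p = 2039, q = 1019, g = 4) that is far too small to be secure and is only there to keep the numbers readable. It proves knowledge of one secret exponent, not an arbitrary computation, and the function names are made up.

```python
import hashlib
import secrets

# Toy group, NOT secure: P = 2*Q + 1 with Q prime, and G has order Q mod P.
P, Q, G = 2039, 1019, 4

def challenge(y: int, r: int, msg: bytes) -> int:
    """Fiat-Shamir: derive the challenge by hashing the public values."""
    data = f"{G}|{y}|{r}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x: int, msg: bytes) -> tuple[int, int]:
    """Prove knowledge of x where y = G^x mod P, bound to msg, without revealing x."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1   # fresh secret nonce in [1, Q-1]
    r = pow(G, k, P)                   # commitment to the nonce
    c = challenge(y, r, msg)           # hash replaces the verifier's random challenge
    s = (k + c * x) % Q                # response; k masks x
    return r, s

def verify(y: int, msg: bytes, proof: tuple[int, int]) -> bool:
    """Check G^s == r * y^c mod P using only public values."""
    r, s = proof
    c = challenge(y, r, msg)
    return pow(G, s, P) == (r * pow(y, c, P)) % P
```

The verifier only ever sees `y`, the message and `(r, s)`. General-purpose ZKPs replace "I know x such that y = G^x" with "I know inputs such that this program outputs that".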
- anonymous signatures, authenticating that a whistleblower complaint comes from a real employee, without knowing who the employee is
- verifying your personal data without making it public. E.g. verifying that you're over 18, either black, disabled or low-income, revealing no other identifying information about yourself. This would require collaboration from the government and "compatible" ID cards.
- Blacklist handling, letting you comment anonymously online while verifying that none of your previous comments have been banned for abuse.
The most common example in the talks I've been to has been verifying anonymous voting among a group whose members you want to confirm are eligible to vote in the process. ZKPs allow for doing this without needing a central authority to attest to the person's credentials.
But it is early days and I think there are going to be many more use cases in the future around data privacy. Take credit bureaus as an example. What if, instead of a lender sending over all the personally identifiable information needed to do a lookup, it could send a ZKP to prove it knows enough information about an individual to be authorized to retrieve their record? Instead of sending SSN, DOB, address, phone and name, it could send just enough specific values in the hash of a combo of some of those fields to prove that the full hash is known, without exposing the full hash itself (along with the existing shared secret that authorizes it to look up a value in the credit bureau in the first place).
I can see applications in multiplayer gamedev - imagine being able to run the whole game simulation on a client's machine and have them assert back to you that they killed 7 goblins, looted a rare sword from a chest, and died 3 times - and you could just trust them.
Your server costs would only need to be for the metaprogression/persistence related stuff that could be done relatively infrequently based on updates from the client.
I agree. Exploring this in game worlds came up in a job interview a few years ago :)
ZK proofs are potentially a transformative tool for real-time distributed systems in general, not just games. They potentially improve latency ("ping") by changing the communication patterns in a distributed consensus system. That's great for games and other real-time systems.
It would be amazing to have a single authoritative server in an mmo for the EU, USA, Asia, Oceania etc regions and only have bad latency if you were actually directly interacting with people from far away.
Imagine you are Goldman Sachs and a client wants to make a 100mm USD wire transfer to one of their accounts at Citibank. How does citibank know that the account at GS has the money to cover this transfer?
Right now, the way this works is essentially through a lot of trust and some guarantees by the fed. This has some downsides: because you need a lot of confirmations, it makes transfers take longer. Also, small players can't really get in on this system, so some regional banks are at a disadvantage.
How do you make this safer and more robust? GS obviously can't send info on all of its clients' accounts and balances to Citi. You could imagine a protocol where the client/GS sends Citi a zkp to prove that the client has the money (as long as all inputs are agreed upon).
Of course, you don't really need zkps. You could also have the fed keep a database on all money in all accounts (like they do in Brazil), so that the bank only has to ask the central bank to give you an ok. But that is a whole lot of power in the hands of a central authority, as well as a single point of failure, which is something banking systems should avoid imo
> How does citibank know that the account at GS has the money to cover this transfer?
At the moment this is all handled with Swift, and I’m not sure what you gain from adding ZKPs. Depending on the transaction you might send a Swift MT799 with a pre-advice letter, a proof of funds letter, or a blocked funds letter. Again depending on what you’re doing you might need a MT760 to send a bank guarantee or some sort of letter of credit, and finally a MT103 to initiate the actual transfer of funds.
At this point your counter party risk lies with the banking institution itself, and their willingness and ability to complete the transactions they have legally committed to, rather than the account holder, and this risk doesn’t go away with the addition of ZKPs.
But Swift is just a messaging protocol, right? It doesn't handle trust at all - like you said, you need an awful lot of documents for a single transfer.
I think what could be gained with a zkp protocol would be timeliness. Not needing to confirm if the client has funds in the other institution manually or from trusting their in house APIs would be pretty nice.
The Brazilian central bank has a system that does essentially that, and wires here (even for very large sums) take seconds to fill, instead of the usual 2 days for US interbank wires.
Swift is a messaging system, but it absolutely does manage trust. Swift messages can be used to transmit contracts and other financial obligations which are non-repudiable.
When using Swift, the financial institution crafts the content of the messages, some of which describe the state of their systems (like an account balance). So as a counterparty, you are trusting the institution, the jurisdiction the institution is based in, and the laws and enforcement in that jurisdiction.
If you introduce ZKPs, perhaps you could take the human out of the message authoring for some message types, but those messages would still be based on the state of systems controlled by that institution, and really a lot of the “trust” involved with Swift transaction is the trust that an institution will meet its future obligations (something ZKPs don’t help with at all). So the end result is that as a counterparty, you would still be trusting… the institution, the jurisdiction the institution is based in, and the laws and enforcement in that jurisdiction.
There are other payment systems that don’t have the same features that Swift has (like the ability to send bank guarantees, or proof of funds letters, etc…) like ACH and SEPA. But if those things are needed, you’ll just use Swift, or a different system entirely.
The delay in processing Swift transactions is also a feature not a bug for large institutions. If I send you a Swift payment for $100, unless one of us is on a watch list or something, it’ll just go through without any additional input required from either of us. But if I wanted to send you $1,000,000,000, at that level the banks want the opportunity to scrutinise the transaction for AML and anti-terrorism reasons. There is no definition of what a transaction that is involved in money laundering or financing terrorism looks like, so these checks cannot be automated in any way. If you want your transaction to go through, you have to answer whatever questions the bank officers ask, provide any material they ask for, and this can include literally anything they deem necessary. If the transaction is successfully completed it is not because you met some statutorily defined requirements, or somehow proved you weren’t money laundering or financing terrorism, it is because you convinced the appropriate bank officer that the risk of them being implicated in money laundering or financing terrorism was small enough to be acceptable in relation to processing your transaction. So ZKPs can’t help you here either.
“I do understand the basic principle of ZKPs, but as yet I’m failing to understand…”
Sounds like you indeed have zero knowledge of zero-knowledge proofs. Congratulations!
If you want, I could prove to you that I know what zero-knowledge proofs are and how they’d be applied in industry, but you’d be no closer to understanding it. I would do it in a specific way that would basically impart zero knowledge to you, beyond the fact that I know what I’m talking about. Interested? :)
- Anonymous credentials (this is what Signal does) - maintain an encrypted blob representing a group chat (members list etc all stay encrypted and Signal cannot tell who is in a group chat). A normal client can provide a zkp that they are in a particular group chat (the decrypted blob contains this member for example) and have a message delivered to other group members. Both the client and the recipient can keep their identities encrypted and the zkp proves the membership of the plaintext client / recipient.
- Encrypt some metadata of a message sent to someone. You can build a ZKP that the plaintext behind the encrypted metadata satisfies some properties, such as the recipient not being in some blacklist (and so on). All this can be done while maintaining privacy because the metadata stays encrypted.
- Given an electronic medical record, you can prove that the record contains a vaccine without sending the record over the wire to some other party.
Lots more such ideas exist.
zkVMs are a good place to start playing with things.
It is currently possible to use ZKPs to set up, via a central authority, a digital cash system where the bank notes are all anonymous and all transfers are anonymous.
The central authority in this scenario cannot discriminate between transactions: any function that compares two or more transactions cannot glean any useful information that would allow it to discriminate. And the security of the anonymity of past transactions is reducible to the security of the cryptographic hash function used (the next best thing to information-theoretic security). As for forging money, depending on which ZKP approach is used, even a quantum computer will be insufficient.
The central authority can still print money and can obviously shut the entire system down.
It is interesting to ponder whether or not some government will decide to take such a step and surrender all control (except for the nuclear option) over how their currency is used. It will certainly boost demand for the currency.
Do you have any recommended references on this subject? Seems like this sort of system would be able to obfuscate a lot of metadata that can be used to deanonymize activity. Very interesting.
Tornado Cash does this. And you can find articles on how it functions online. You can even read the smart contract that directly implements it.
Roughly: you have 2 secrets that you hash together, and the central authority adds the result you disclosed to a public list (either to print money or as part of a transaction to transfer money). To spend a note you reveal the hash of one of the secrets (to be added to a list of nullifiers to prevent double spends) and you produce a ZKP demonstrating that you possess both of the secrets to *some* note from the public hash list and that the nullifier for that note is what you claim it is. The central authority rejects the spend if the nullifier is already present in the list.
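The bookkeeping side of that can be sketched in a few lines. Big caveat: the ZK proof itself is replaced here by `stand_in_proof`, which checks the same statement with the secrets in the clear, so this sketch has no privacy at all. It only shows what the authority stores and what the proof would assert. The class and function names are hypothetical, SHA-256 is an assumption, and none of this is Tornado Cash's actual contract code.

```python
import hashlib
import secrets

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big") + p)  # length-prefix each part to avoid ambiguity
    return h.digest()

class Mint:
    """Hypothetical central authority: public note list plus spent nullifiers."""
    def __init__(self) -> None:
        self.notes: list[bytes] = []
        self.nullifiers: set[bytes] = set()

    def deposit(self, commitment: bytes) -> None:
        self.notes.append(commitment)

    def spend(self, nullifier: bytes, proof_ok: bool) -> bool:
        # proof_ok stands in for the result of verifying the ZK proof.
        if not proof_ok or nullifier in self.nullifiers:
            return False                     # bad proof or double spend
        self.nullifiers.add(nullifier)
        return True

def new_note() -> tuple[bytes, bytes, bytes]:
    k, r = secrets.token_bytes(32), secrets.token_bytes(32)  # the two secrets
    return k, r, H(k, r)                     # commitment = hash of both secrets

def stand_in_proof(mint: Mint, k: bytes, r: bytes, nullifier: bytes) -> bool:
    """The statement a real ZKP would prove without revealing k, r or which note."""
    return H(k, r) in mint.notes and nullifier == H(k)
```

With a real proof system the authority would see only the nullifier and the proof, so it could not link the spend to any particular entry in `notes`.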
There are some other approaches to such a system, I believe the Tornado Cash one is the most elegant though it limits you to a discrete number of note denominations.
Note that the proof system Tornado Cash uses is not secure against a quantum computer, and such a device would make it possible to "print money" (in reality, drain the smart contract).
ZKPs are used for private balances in Solana. Someone can send you a million PYUSD using confidential transfers and your public balance remains 100 dollars.