Age: Modern file encryption format with multiple pluggable recipients

FiloSottile · on Feb 26, 2023

_o/ hi all, age author here!

The OP link is the spec, here's a few other things you might find interesting

- the Go reference implementation https://age-encryption.org

- the Go library docs https://pkg.go.dev/filippo.io/age

- the CLI man page https://filippo.io/age/age.1

- the large reusable test suite (which I should write about!) https://c2sp.org/CCTV/age

- an interoperable Rust implementation by @str4d https://github.com/str4d/rage

- a YubiKey plugin by @str4d https://github.com/str4d/age-plugin-yubikey

- the draft plugin protocol specification (which we should really merge) https://github.com/C2SP/C2SP/pull/5/files?short_path=07bf8cc...

- a Windows GUI by @spieglt https://github.com/spieglt/winage

- a discussion of the authentication properties of age https://words.filippo.io/dispatches/age-authentication/

- a discussion of a potential post-quantum plugin https://words.filippo.io/dispatches/post-quantum-age/

- a password-store fork that uses age instead of gpg https://github.com/FiloSottile/passage (see also: how I use it with a YubiKey https://words.filippo.io/dispatches/passage/)

(Yes I should make a website to collate all this.) Happy to answer any questions!

aborsy · on Feb 26, 2023

I have concerns about the symmetric encryption with a 256 bits password in Age.

The File Key is 16 bytes, or 128 bits. So, although ChaCha20-Poly1305 can protect data with 256 bits of security, the symmetric encryption in Age is essentially 128 bits. This is sub-standard. In particular, files encrypted with Age may be decrypted with quantum computers in the future (for example cloud backups). There is little reason to use 128 bits symmetric encryption these days when 256 bits is far more secure and almost equally performant.

I understand the file format is designed around 128 bits asymmetric encryption. Still it would be great if the format/software could be updated so that the symmetric encryption with a 256 bits password is 256 bits (not silently falling back to 128 bits). At least, this issue should be clearly stated in the front page.

FiloSottile · on Feb 26, 2023

I've responded to these concerns in my reply to the sister comment. https://news.ycombinator.com/item?id=34949197

FWIW, no cryptographer I know would call 128 bits "sub-standard" and NIST itself is of the opinion that quantum computers will not break 128 bit keys. The links in my other comment get into more details as to why.

I want to mention something about "this issue should be clearly stated in the front page". I've seen this suggestion made many times about many projects, and I disagree with it almost universally, aside from the fact that it's often made about non-issues, such as in this case. Security (and security-relevant) tools should not have issues that need communicating to users, and if they did they should fix them, not delegate a choice to a user. Imagine if every project came with a list of potential issues you have to form an opinion on, and that you're not allowed to complain about because you were warned, after all. It's not a solution! If I thought this was an issue, I would produce a v2 of the spec, and guide users through migrating, instead.

aborsy · on Feb 27, 2023

Thanks for the clarification.

To sum up for people here, the Grover’s algorithm can’t be easily parallelized. Operating serially, it’s difficult to implement. Hence, it may not reduce the security of 128 bits to 64 bits; it’s less effective in practice.

Keep in mind that NIST doesn’t recommend symmetric keys below 112 bits. So the margin with 128 bits is low. I give you an example. If the user’s random number generator isn’t perfect, the file key in Age will contain an entropy less than 128 bits, which quickly gets you into an uncomfortable area. You should also take into account small reduction in security due to new attacks and speed ups. I am not a cryptographer; a cursory look at Grover’s doesn’t cut it. If I write crypto software, I will err on the side of ignorance and be conservative in my design.

As for the cryptographers’ opinion, I think the industry standard for encryption of data data at rest is AES-256. The acceptable range is 128–256, but 128 is the low end of the range, the recommendation is 256 bits which is what most companies use. Sure, 100 bits may not be breakable now or in short term, but nobody uses 100 bits for that reason (note that Age uses 10 words in its default, which if using bips list, is 110 bits).

Lastly, you should not forget compliance. Top secret information is often required to be encrypted with 256 bits (see NSA recommendations).

FiloSottile · on Feb 27, 2023

> note that Age uses 10 words in its default, which if using bips list, is 110 bits

That's correct. The passphrase is then fed to scrypt with N = 18, so the total number of operations required for a conventional key search that enumerates possible passphrases is 2^128. I do wonder how hard scrypt is to implement in a QC, I'll look in the literature or ask. Even if scrypt costed zero gates, which seems unlikely since memory accesses are not free, 110 bits Grover would still require ~2^112 gates at MAXDEPTH 2^40.

> I think the industry standard for encryption of data data at rest is AES-256.

Discussions about what is or isn't "industry standard" are hard to have productively because there is no authoritative or technical answer, but I will point out that mail.google.com and Chrome use 128 bits ciphers to talk to each other. Data at rest is not more sensitive than data in transit, the latter can just be recorded by an attacker and cracked in the future.

I agree that compliance usually prefers 256 bits.

2h · on Feb 27, 2023

wow. this is such a bad take, it makes me question using Age at all.

> tools should not have issues that need communicating to users, and if they did they should fix them, not delegate a choice to a user.

no tool is perfect, including Age. and no issue is fixed immediately, or at all. so instead of waiting months or years for an issue to be fixed, it can be quite helpful to throw a quick warning on the front page, so at least people are warned in the meantime. Its just a sign of respect, and of not wasting the users time.

> Imagine if every project came with a list of potential issues you have to form an opinion on

I would be glad if more projects included important to-dos or shortcomings up front, so that I dont have to discover them myself after a week or month of use. Further, no one is forcing users to form an opinion, or even read the known issues. you throw them in a section at the bottom the of README, then they can read if they want, and form and opinion if they want.

without doing this, you're essentially lying to the userbase, as you are aware of issues, perhaps even important ones, that are not fixed, and are purposefully hiding that information, or at least not being as forthcoming with it as you could be.

d-z-m · on Feb 27, 2023

> wow. this is such a bad take, it makes me question using Age at all.

I would think it would bolster your confidence in age, as the maintainer is expressing a commitment to fix issues as they arise, not punt indefinitely with TODOs at the bottom of a README.

> > tools should not have issues that need communicating to users, and if they did they should fix them, not delegate a choice to a user.

You stripped important context from the beginning of this quote, the original reads:

> Security (and security-relevant) tools should not have issues that need communicating to users, and if they did they should fix them, not delegate a choice to a user.

Which seems reasonable.

loup-vaillant · on Feb 27, 2023

You might want to read this paragraph from @FiloSottile:

> The only reason to use more than 128 bits of key is to protect against multi-user attacks. […] There are two ways to protect against that: larger keys or nonces. age uses a 128-bit per-file nonce fed into HKDF, making the total search space 128 + 128 = 256 bits, safe in every multi-user scenario, too.

This is big. Without those multi-user attacks Age is much safer than I initially thought. The tiny security increase we'd get by going to 256-bit keys is not worth making a breaking change. Not by itself at least.

loup-vaillant · on Feb 26, 2023

Hi, something is bothering me here: https://github.com/C2SP/C2SP/blob/main/age.md#file-key

> Each file is encrypted with a 128-bit symmetric file key.

Did you meant 256 bits instead? Or is this a deliberate choice?

As far as I can tell the body is encrypted with a 256-bit key derived from the file key, and using a 128-bit file key would limit the security of the whole scheme when using it with X25519 (its ~128 bits of collision resistance is better than the 128 bits of preimage resistance of the file key), or with extremely strong passwords.

The only rationale for this choice I can think of is saving space, but your choice of a textual format with base64 encoded keys & nonces suggests you didn't care that much about space…

I'm confused here, what did I miss?

loeg · on Feb 26, 2023

> Did you meant 256 bits instead? Or is this a deliberate choice?

No; yes.

https://github.com/FiloSottile/age/blob/891be91d42721088db0d...

FiloSottile · on Feb 26, 2023

This is a common question! It's a deliberate choice.

The notion of security levels is somewhat in disuse. It's impossible to do any operation, even moving a single electron, 2^128 times so anything that actually has 128 bits of security is "secure enough" for any requirement. This is not a new idea, see agl's post from 2014. [https://www.imperialviolet.org/2014/05/25/strengthmatching.h...] There are arguments that boil down to "more is always better" but I am unconvinced by them as they could be used to argue for 512, 1024, 4096 bit symmetric keys as well, why stop at 256 bits. Part of what changed is that fundamental cryptographic primitives break a lot less than they used to do. "Too much crypto" is a good read about this. [https://eprint.iacr.org/2019/1492]

The only reason to use more than 128 bits of key is to protect against multi-user attacks. That's a situation where an attacker is trying to break one of a very large number of ciphertexts or keys, and would be satisfied with breaking any of them. If you are trying to break one of 2^52 ciphertexts encrypted with 128-bit keys, you can theoretically do it in 2^76 time, which might be doable! (Major asterisks, like that they all need to have encrypted the same plaintext, but anyway.) There are two ways to protect against that: larger keys or nonces. age uses a 128-bit per-file nonce fed into HKDF, making the total search space 128 + 128 = 256 bits, safe in every multi-user scenario, too.

Why use a nonce and not a bigger key? That works out to the same file size overhead! The difference is that the key is repeated for every recipient while the nonce is only serialized once. That means that if a file has 65 recipients, this will let us save about a kilobyte. Is it worth a lot? Not really, but it was free. It also makes most stanza bodies a single line, which is nice.

There's also a misconception that 128 bits are not enough for post-quantum resistance. I also thought that, but it turns out to be based on a simplistic understanding of Grover's algorithm. I wrote about it in the context of age specifically [https://words.filippo.io/dispatches/post-quantum-age/#128-bi...] but if you don't trust me you can also check the very last FAQ on NIST's PQC page. [https://csrc.nist.gov/Projects/post-quantum-cryptography/faq...]

(I am not sure what you mean by X25519's "collision resistance" vs the file key's "preimage resistance". Those are hash function security notions and this is a more complex setting. As you know, X25519 is a key exchange algorithm, not a hash function, and ~128 bits is the amount of work required to reverse the private key into the public key. Anyway, to get a collision in the derived file key, both the key and the nonce would have to collide (128 + 128 = 256 bits) so the "collision resistance" of the whole age scheme is still 128 bits.)

loup-vaillant · on Feb 26, 2023

Thank you for your quick answer.

> I am not sure what you mean by X25519's "collision resistance" vs the file key's "preimage resistance".

There's a significant difference, not only with multi-user attacks (which I see you know of), but with your chances of success with scaled down brute force attacks. If you divide your key search effort by 10 your chances of success are divided by 10 (linear drop off). But if you divide your hash collision (or ECC brute force) efforts by 10 your chances of success are divided by 100 (quadratic drop off). [1]

Those two reasons are why I believe Daniel J. Bernstein when he says that Curve25519 is "high security" even though he has serious concerns with 128-bits AES. [2]

[1]: https://loup-vaillant.fr/tutorials/128-bits-of-security

[2]: https://cr.yp.to/snuffle/bruteforce-20050425.pdf (Understanding Brute Force)

FiloSottile · on Feb 26, 2023

> If you divide your key search effort by 10 your chances of success are divided by 10 (linear drop off).

Right, but if the starting point is 128 bits it doesn't matter how much you can parallelize the attack, you're not going to find a key regardless of how you divide the key space, even if the chances drop off only linearly.

Again, this is about "good enough". 128 bits of collision security are "better" than 128 bits of "preimage security" (I've never heard resistance against encryption key brute force called preimage security, but I think I get it) in the same way that 256 bits of collision security are better than 128 bits of collision security, or any bigger number is better than a smaller number. Cryptography needs to be secure, not as big as possible.

loup-vaillant · on Feb 26, 2023

> Right, but if the starting point is 128 bits it doesn't matter how much you can parallelize the attack, you're not going to find a key regardless of how you divide the key space, even if the chances drop off only linearly.

If you combine a parallel attack with a scaled down search, a standard parallel attack can give a state-level attacker chances comparable to the lottery. Very low, just not quite impossible. Enough to protect your wallet, or even your life, perhaps a tad low to protect something like Wikileaks. (Perhaps. the US has shown they have other means.)

> I've never heard resistance against encryption key brute force called preimage security

I'm trying to coin the term, for lack of anything better. As far as I am aware there are two broad classes. One includes key searches and hash preimage attacks. The other includes hash collisions and discrete logarithm problem (without the help of index calculus).

If you have a better term for key search/preimage attack that is more readily understood by readers I would try to use that instead.

> any bigger number is better than a smaller number.

Sure, the choice is what threshold we might use. To do this it might help to get an upper bound. I've read an article stating that a perfect computer operating at space temperature would require the energy of something between a nova and a supernova to explore all the configurations of 256 bits. This is more than the entire output of our sun, so we know that there's no point in ever going higher.

On the lower bound section we have the number of hashes performed by the Bitcoin network. Their peak right now seems to be around 370 Exa-hashes per second, or about 2^93 hashes per year. It thus seems reasonable to never go below 93 bits of security (of any kind).

We can debate the exact numbers. My feeling is that anything above 192 bits is high enough to be utterly boring, and anything below 100 bits is too low for anything high stakes. Between the two however I'm forced to agree with you: any bigger number is better than a smaller number.

upofadown · on Feb 27, 2023

>...about 2^93 hashes per year.

128-93=35 bits. 2^35=34 billion years which is significantly longer than the age of the universe. That seems to support the contention that 128 bits is simply not brute forceable at the chances of a lottery or at all.

100-93=7 bits. 2^7=128 years. So 100 might be a bit high for a reasonable lower bound, but is at at least reasonable assuming some sort of breakthrough in computing.

loup-vaillant · on Feb 28, 2023

Crap, good catch… I somehow managed to pretend 2^35 = 35. Oops.

That being said, one chance in 34 billion is only 3 orders of magnitude from typical lotteries, and in settings where multi-key attacks are applicable (not Age), we can definitely achieve lottery levels of success (with 1000 keys we get down to 1 in a 34 million chances for 1 year). 100 bits looks too low: in 1 year chances of success on a single key go up to 0.8%. That's pretty high.

We also need to keep in mind the actual algorithm used. When it's SHA-256 and Chacha20 we can say it's pretty comparable to Bitcoin, though there might be a small factor difference between hashing a full block and trying a key in a file. But if the entire path involves more hardware friendly primitives like AES, dedicated silicon could be quite a bit more efficient than the Bitcoin network currently is, and would drive down the costs (or drive up the chances of success) accordingly.

Now, do I actually believe a state level attacker would construct something as powerful as the entire Bitcoin network just so they have an extremely slim chance of cracking one key among many after years of burning energy over the search? No. 128-bit keys are safe, even in the face of multi-user attacks.

But the reasoning required to arrive to this conclusion is more complex than the one needed to assert that 256-bit keys are safe, because "there's still a chance". It's not plausible at all, but it remains humanly possible. With 256-bit keys we know it's flat out impossible. With 256-bit keys we don't even need to think, and that alone has some value.

Retr0id · on Feb 26, 2023

I think boring-ness is the crux of it, for me.

From what I've understood, 128-bits is fine-in-practice, unless you're an extremely high-value target who is also feeling unlucky. Even if I was a whistleblower under an authoritarian regime, I'd feel pretty safe using Age in its current form (at the very least, I'd have bigger things to worry about).

However, it's not quite boring enough - the fact that this conversation is happening at all attests to that fact. It would cost almost nothing to increase file key size to 256 bits, and so IMHO it should be considered for an "age v2" - but at the same time, not something people should be alarmed about in the current implementation. (with the caveat that I'm not exactly qualified to make that assertion)

loup-vaillant · on Feb 26, 2023

> IMHO it should be considered for an "age v2"

Definitely, though personally I wouldn't go as far as recommend we make a version 2 just for this. The downsides of such a change are not worth the arguably extremely slim security benefit.

(And I agree with Age author it's not worth warning users about either. Not now in 2023.)

Retr0id · on Feb 26, 2023

Right. I don't think this is a reason alone to make breaking changes, but if a new version is coming out for whatever reason(s), it should be bundled in with the other changes.

mytailorisrich · on Feb 26, 2023

256 bits is stronger than 128 bits and the extra cost does not matter for a tool designed to encrypt files.

It also saves a lot of time answering questions and justifying the choice.

So, IMHO, go for 256 bits and be done with it.

tptacek · on Feb 26, 2023

No, you can't really win here. If you reflexively bring your symmetric constructions to 256 bits, you end up in equally unproductive arguments about parity with your asymmetric constructions.

loup-vaillant · on Feb 26, 2023

In my experience, much fewer people object to the use of Curve25519 than they do about 128-bit symmetric keys. I believe one reason is that bigger curves have a significantly higher cost, while limiting symmetric encryption strength to 128 saves zero CPU cycles with Chacha20.

For the few who do object, explaining that there is no such thing such as parity between symmetric and asymmetric primitive can help: https://loup-vaillant.fr/tutorials/128-bits-of-security

mytailorisrich · on Feb 26, 2023

Yes, everyone will immediately pick on 128 bits for symmetric encryption but I suspect the vast majority of people won't pick on the fact that Curve25519 offers less bits of strength than 256 bits symmetric encryption, or they won't care. So you'll still avoid a lot of questions and have a better marketing box-ticking scorecard, will probably still be more secure.

tptacek · on Feb 26, 2023

I'm responding to a comment suggesting that designers elect for 128 bit primitives to head off unproductive debates. Suggesting that you have a strong preprepared answer for those debates isn't really responsive to what I'm saying.

loup-vaillant · on Feb 26, 2023

Sorry, I didn't mean to convey a strong disagreement here. I mostly agree with you on this.

Besides, we know the next step: use Curve448 to stave off criticism about using a small curve, and then people start talking quantum computers.

vbezhenar · on Feb 26, 2023

2560 bits is stronger than 256 bits and the extra cost does not matter for a tool designed to encrypt files. See what I did here? Where do we stop?

loup-vaillant · on Feb 27, 2023

We stop before the key is big enough to requires the energy of our sun to crack, even if you have an ideal computer. 256-bit keys require more energy than a nova.

128 bits however is humanly achievable (35 years worth of peak Bitcoin mining), and as such perhaps a tad low in really high stakes scenarios (say highly classified stuff).

We could chose something between 128 and 256, but they're nice powers of two, and we tend to like powers of two.

vbezhenar · on Feb 27, 2023

To enumerate 128 bits in 35 years you would need to make 3e29 attempts per second.

Current bitcoin miner can provide efficiency of 19 000 000 MH/J or 19e12 H/J. If we would assume bruteforcer can perform with the same efficiency, we would need 1.6e16 J/s = 16 000 TW.

Current humanity energy output is something like 17 TW.

loup-vaillant · on Feb 28, 2023

Yeah, botched my numbers, sorry. I saw that the hash rate of the current Bitcoin network went up to 2^93 hashes per year, and then I managed to pretend 2^128/2^93 = 35, instead of 2^35.

With your numbers if humanity dedicated all its energy to brute forcing a 128/bit key for a year it would have a ~0.1% chance of finding it. I guess that's a pretty good argument that no one is going to even try it in the foreseeable future.

See this cousin comment highlighting my error. that out: https://news.ycombinator.com/item?id=34955855

upofadown · on Feb 27, 2023

I got billions of years when I used the current bitcoin hashing rate for 128 bits. Are you assuming a birthday attack?

loup-vaillant · on Feb 28, 2023

No, I just botched it and pretended 2^35 = 35. See these other comments for more details:

https://news.ycombinator.com/item?id=34955855

https://news.ycombinator.com/item?id=34963094

mytailorisrich · on Feb 26, 2023

This is absurd. I am only comparing the two main options (more or less) for symmetric encryption: 128 or 256.

Edited as not AES but you get the point taking into account that AES256 is the standard.

tptacek · on Feb 26, 2023

Age doesn't use AES.

amluto · on Feb 26, 2023

> age uses a 128-bit per-file nonce fed into HKDF, making the total search space 128 + 128 = 256 bits, safe in every multi-user scenario, too.

If I read the description right (big if), age is using HKDF to derive a 128-bit key. In this case, it does not help at all that there’s a 128-bit nonce: the attacker will do a multi-user attack against the HKDF output, and the attacker wins (with asterisks, etc).

FiloSottile · on Feb 26, 2023

Nope, HKDF outputs are always 256 bits. They are used either as a HMAC-SHA-256 key or as a ChaCha20Poly1305 key, so everything downstream of HKDF is 256 bits.

In "Conventions used in this document" you'll find the sentence "The length of the output keying material is always 32 bytes." making this explicit.

amluto · on Feb 27, 2023

Gotcha. I was looking at the wrong little bit of code and was distracted by all the mentions of AES128.

amluto · on Feb 27, 2023

Actually, I changed my mind. Age uses the file key for at least three things:

1. HKDF with a random 128-bit nonce to derive the actual payload key.

2. HKDF with no nonce to derive the HMAC key for header authentication.

3. As input to every “recipient”.

I’m willing to believe that #1 is not subject to a multi-target attack and that this might even be provable.

I’m less willing to believe, without evidence, that #2 is safe. There isn’t an obvious (to me, on brief consideration) to, for example, tell whether two age files have the same HKDF output for the header key, but that’s only because the headers themselves are likely to differ for age files because the recipient stanzas are (presumably!) unlikely-to-collide functions of the file key.

I don’t like #3 at all. One could easily break multi-target security by having a DH recipient type that is just the identity to the power of file key. And I bet one could design a recipient type that looks reasonable on its own (and even has a security proof!) but breaks the system completely if two instances of the same recipient type are used in the same file.

If I were designing a v2, I would make two changes:

a) Either the file key is never used without a random nonce or the file key is >=256 bits, or both.

b) The recipient stanzas are not functions of the file key. Instead there is a per-stanza wrapping key that wraps the file key separately for each recipient.

The goal would be to enable a security proof that only needs to assume that each recipient stanza is independently secure.

Retr0id · on Feb 26, 2023

I can't answer the why, but I can confirm that it's not a typo, the file key is indeed 128-bit as-implemented.

However, the "payload key" is 256-bit, derived from the 128-bit file key via HKDF-SHA256. It's the "payload key" that keys the ChaCha20Poly1305 cipher, which encrypts the file payload itself.

raggi · on Feb 26, 2023

What was the motivation for introducing C2SP and CCTV as concepts? Are there immediate plans to introduce more than Age?

FiloSottile · on Feb 26, 2023

Yes, they are broader projects, although it's too early for a proper announcement. They make sense together, hence why CCTV is under C2SP, but they have different motivations.

C2SP is an experiment in producing specifications applying techniques from software development, including semantic versioning and the concept of a maintainer. It was borne out of displeasure with the pace and output of the IETF's CFRG, which is taking years to publish ready-made documents like ristretto255, and makes complex protocols with lots of sharp edges and moving parts, as well as being generally a pain to engage with.

CCTV is a place to pool test vectors, so we don't have to keep reinventing them for each implementation and we can cross-pollinate our test coverage. It's also a place to drop new vectors to test for newly found edge cases or issues. You can think of it as an open Project Wycheproof.

raggi · on Feb 27, 2023

Thank you!

d-z-m · on Feb 27, 2023

Classic McEliece age plugin...thoughts? ;)

raziel2p · on Feb 26, 2023

Why not just link to the actual code repository? https://github.com/FiloSottile/age

loup-vaillant · on Feb 26, 2023

As far as I am concerned, the specs convey more information in less time than the source code would have. Linking directly to them caused me to actually read them, and I'm actually glad I did.

I already knew of Age, so if this was a link to the repository root instead I would have jumped straight to the comments section, and missed a couple genuinely interesting design choices about it.

lazypenguin · on Feb 26, 2023

I ran into this recently when using https://github.com/mozilla/sops. It worked pretty nicely and as a tool was quite easy to get started with.

speedgoose · on Feb 26, 2023

Sops is great. I hope they will find new maintainers. https://github.com/mozilla/sops/discussions/927

vbezhenar · on Feb 26, 2023

Thanks for age, this is my go-to tool for any encryption tasks I need. I'm using it to encrypt my secrets in repo, backups, it's incredibly convenient. I used to write openssl invocations with questionable security, but age makes me more confident.

theK · on Feb 26, 2023

> The textual file header wraps the file key for one or more recipients, so that it can be unwrapped by one of the corresponding identities

Isn't that what LUKS does as well? What's the novelty in this one?

Macha · on Feb 26, 2023

My impression is that the novelty is less about the encryption format and more about the tools and ecosystem being understandable to anyone that has used ssh before (and indeed, in the case of the reference implementation, even letting you use straight up ssh keys)

fbdab103 · on Feb 26, 2023

Indeed, ease of use is almost more important than security. If the tooling is so hard to use correctly that everyone makes a mistake, users gain little. The interface to age seem so straightforward that as long as the crypto primitives are solid, this would do a lot for making encryption more accessible.

Edit: correct an English gaff

theK · on Feb 26, 2023

Sorry, I kinda missed that. The "multiple recipient" thing is quite prevalent in the titles (here and on github) so I thought this was supposed to be the key differentiator.

hamandcheese · on Feb 26, 2023

I'm not sure there is any specific feature that is novel, nor do I think any claims as such were made. The novelty is packaging it all together with an easy to use, hard to misuse UX.

waynesonfire · on Feb 26, 2023

Nice call out.

I've seen age come up a few times on hackernews. It's trying so hard to dislodge gpg but I just can't see that happening for me. In order to replace gpg, it would have to be 10x better and that's quite a leap for an encryption tool that requires stability, compatibility.

Overcoming the gpg learning curve is do-able. It's worth it. You'll thank yourself in 10 or 20 years when you need to decrypt a file. Get a Yubikey as well, or two.

fbdab103 · on Feb 26, 2023

I am happy to change with the times and adopt modern tools. Something without all of the sharp edges that make encryption more difficult in practice.

If nothing else, by being written in a memory safe language, age is unlikely to ever have a comparable list of CVEs[0] as GPG. A code execution vulnerability was discovered in GPG as recently as this year. Such vulnerabilities could of course still happen in age, but they are always going to be less common.

[0]: https://www.cvedetails.com/vendor/4711/Gnupg.html

loup-vaillant · on Feb 26, 2023

As the author of a non-negligible amount of peer reviewed cryptographic code, I happen to know a bit about how much safe languages help you. And the answer is, "less than you'd think".

Modern cryptographic primitives benefit very little from safe languages like Rust, compared to using C and running the test suite under Valgrind and Clang sanitisers (all 3 of them). Undefined behaviour related errors are stupidly easy to test: no heap allocations, no input dependent branches or indices, pathologically straight-line code, all that makes it fairly easy to implement an extremely effective test suite that is almost impossible to circumvent… except for logic bugs. In fact, the only significant bug I've ever shipped to production was a logic bug Rust or Go wouldn't have saved me from.

The real problems start when you start tackling encodings & textual formats. Those are definitely gnarlier, and I personally wouldn't trust myself with an ASN-1 decoder in C. This can be avoided by using simple binary formats instead, but since Age has decided to be a little more advanced using a safe language is definitely a plus.

Retr0id · on Feb 26, 2023

As a text format anti-fan, I recently took a stab at defining a binary file format that's semantically equivalent to Age (more or less) [1]. It's currently just a draft/prototype/PoC, although I'd be interested to get some more eyes on it. Conceptually, it should be possible to implement compatibility with Age's plugin ecosystem, although I haven't done that yet (and I haven't thought about potential cryptographic implications of doing so, either).

I've written a bit more about my rationale and why I'd prefer a simple binary format here [2] - notably, I found two text parsing bugs (of low/moderate severity) in existing Age implementations.

[1] https://github.com/DavidBuchanan314/bagel

[2] https://github.com/FiloSottile/age/discussions/497

FiloSottile · on Feb 26, 2023

Hi! I read and appreciated your issues and discussions, sorry I didn't get to respond to them yet, but I've been thinking about it.

Although I don't disagree that parsing text is hard, I also think that parsing variable-size binary formats is hard (and there is a tall, tall pile of bugs to confirm that). Really, parsing is hard. Rather than count on one design or the other to be bug-proof, I worked on a large test suite to help implementations catch their parsing bugs. [https://c2sp.org/CCTV/age] I think it would have found one of the issues you reported if that implementation had integrated it, and I am going to add vectors for various resource exhaustion scenarios which I hope would have found the other. (I am not going to look at what it is exactly, so I will know if I made the suite comprehensive enough without being too specific about this bug.)

I also liked your observation that it would have been nice if the header was streamable. [https://github.com/C2SP/C2SP/issues/28] It went on the pile labeled "regrets / for v2 when it comes", thank you.

loup-vaillant · on Feb 26, 2023

Just read it, I like it. I have a couple simplification opportunities:

Instead of having variable length types you could just encode them in 2 bytes magic numbers (or even 1). That should leave you plenty of different types to work with, make the encoding smaller, and is one less variable width field to deal with.

The second one I'm not sure about: starting the header with the length of the entire header, instead of marking the end with a special entry. The hope here is that recipients would know right of the bat how long the header actually is, and read it right away. They'd also know where to begin the actual decryption instead of scanning for the last special stanza.

The reasons I'm not sure about this last one because are (i) I don't know if this will actually simplify your code, and (ii) the simplest implementations would effectively limit total header size.

Retr0id · on Feb 26, 2023

I did consider magic numbers for the types, however, the current version of Age is extensible without active coordination - that is, if you're designing a custom recipient type you call it "example.com/whatever", and you can be pretty sure it's globally unique, whereas with a small magic number you might collide with someone else's extension.

I didn't use a overall header length for two reasons:

- Having length fields that potentially "overlap" introduces parsing edge-cases. For example, what happens if the final recipient "overflows" the length specified earlier? Obviously you'd add bounds checks for this, but another implementer might forget!

- An implementation should scan over all the recipient fields to check that the file is well-formed before it attempts decryption. Not including the length forces parsers to do this. I don't think there's meaningful overall performance to be gained by skipping ahead anyway (since time spent doing X25519 or scrypt will dwarf the time it takes to do the parsing).

- Reason 2b - it should be possible to parse the file in a fully streaming fashion, with bounded memory usage (no dynamic allocations), without needing any hard limits. I might do a C implementation that demonstrates this concept.

loup-vaillant · on Feb 26, 2023

> Having length fields that potentially "overlap" introduces parsing edge-cases.

Oh, that's a very good point actually. I'll keep that in mind, thanks.

> An implementation should scan over all the recipient fields to check that the file is well-formed before it attempts decryption.

Ideally an implementation should authenticate the whole header.

This means appending an authentication tag to the header, computed with from the file key. Of the top of my head, I would derive a header key and a payload key from the file key, using either a stream cipher, a hash, HMAC, or HKDF expand. Then I'd use the header key to authenticate the header. Probably using a keyed hash or HMAC to get key committent and avoid partition attacks down the line. Or, if key commitment is handled in the stanza themselves, with a fast polynomial hash.

> I might do a C implementation

In my opinion file formats should have a C implementation whenever possible: if a language as weak and as unsafe as C can handle it without too much trouble, we know it's a simple enough format.

Retr0id · on Feb 26, 2023

> Then I'd use the header key to authenticate the header. Probably using a keyed hash or HMAC to get key committent

This is already the case :)

> In my opinion file formats should have a C implementation whenever possible

Yup, I totally agree. Although I haven't written any of the code yet, I've been architecting a hypothetical implementation in my head while designing the format.

tptacek · on Feb 26, 2023

The PGP learning curve is not the most important reason to move on from PGP; security is a much bigger issue with it. PGP is an archaic design, and archaic cryptography designs are not a good thing.

andrewmcwatters · on Feb 26, 2023

When I worked at a company called Smarsh, I wrote an encrypted file format for them for use in cloud agnostic storage utilizing envelope encryption. I read dozens of state of the art documents, various RFCs, and researched even proprietary formats for prior art references.

We also used Go, which I consider the best possible language to use with respect to working on encryption. Kudos to Filippo Valsorda for his work on Go, and age. Hands down the best work in the space at the moment. It doesn't get better. I've done my homework.

But in my opinion, this is still an unsolved problem. age simply makes a dent in an otherwise immature space, with simple questions:

1. What cyphertext, salt, iv structure layout should the encrypted payload follow? There are de facto binary structure layouts that prior art has established a practice for, but it does vary in order, and no papers talk about embedding specific details you need for the decryption process such as required lengths that otherwise need agreed upon static values.

2. What headers should be provided, if any, to provide mechanisms for versioning and reading encryption strategy? Should part of this metadata be in the file extension instead, such as `.aes256-gcm' and `.aes256-gcm.key'?

To my surprise, despite the longevity of the industry, there was no simple file format that I could turn to that didn't involve being tightly coupled to email or a concept of a recipient.

Unfortunately, age was:

1. Too young to consider for enterprise rollout.

2. Failed to differentiate itself from previous standards that were tightly coupled to email or recipients.

3. It fails to meet encryption standards that are more aligned with compliance than the state of the art.

Regarding point 3, I appreciate its desire to stay small and utilize ChaCha20-Poly1305, but that's insufficient for deployments with specific requirements.

I think someone needs to come out with an alternative to age that allows you to specify the cipher, and supported ciphers in the standard need to have specific binary layouts that can be agreed upon.

I would personally prefer some metadata in the file extension, but I think this is naive, because re-encrypting a file would lose this information, or dumping binaries to pathnames with arbitrary file names would destroy this information, so it must be in a file header.

The file header needs a specific magic number that can be read by utilities, and those utilities can further probe the file for metadata.

age does some of these things, but it doesn't have answers for others.

But gpg isn't a solution, either. Nor are numerous other encrypted payload formats for various reasons.

FiloSottile · on Feb 26, 2023

Thank you for the kind words about my work on Go!

I think there are no standards for the things you mention because there shouldn't be any. If they existed, they would be necessarily complex and bloated like JWT or XML, introducing unneeded complexity in anything that adopted them. age tries to be as much complexity needed for the use case and no more. Other designs can make their different choices for their different use cases. That's good!

No contest on (1). On (2) I want to mention that "recipient" might have been an unfortunate choice of word, but it has nothing to do with email, it just means "a thing that can be encrypted to" like a public key, and any envelope encryption scheme will have something like that even if it doesn't give it this or any name. For (3) I have no idea what your requirements are, but I would claim age is pretty much state of the art (partially because it's not hard, age is trivial by cryptography research standards). Wireguard is being merrily deployed everywhere and uses ChaCha20-Poly1305 as well, FWIW.

(age does have a magic number in the header, and I see you came to realize why metadata in files is a bad idea. It's a bad idea also because it would be attacker-controlled.)

[Edit: s/No context/No contest/]

andrewmcwatters · on Feb 26, 2023

A part of the issue is that we tried to avoid utility deployment in parallel with the micro services we were already building.

So we emphasized trying to focus on the data itself first, and as a result what standards we could rely on, anticipating that we would use some implementation in Go.

openssl-enc doesn’t output to a file standard. It has known binary layouts but they’re not for use other than within the utility. `openssl enc’ was not something we were going to utilize for clients’ billions of files nor was replicating the binary layout for compatibility.

At the time we deployed age had been released for about two weeks, IIRC. Not something I as a principal can take to management and say we’re going to use for some of the largest clients in the world. Ironically, we could take on the risk internally with our own research.

But we needed cipher compatibility with the KMSs we were working with. age doesn’t provide that when everyone else is speaking aes-256-gcm.

FiloSottile · on Feb 26, 2023

That makes perfect sense, two weeks is way too early to deploy something in production.

Also, good call on not using openssl(1) in production. Last time I checked that CLI was primarily meant for testing, and anyway is full of sharp edges.

Not sure what AES-256-GCM vs ChaCha20Poly1305 has to do with KMSs though? I ask because age is specifically designed to support pluggable key wrapping mechanisms to support KMSs. You can write a plugin that talks to your KMS to wrap the file key, and use age for everything else. Surely you're not sending the whole file payload to the KMS.

andrewmcwatters · on Feb 26, 2023

No, we simply didn’t consider the option you have mentioned.

FiloSottile · on Feb 26, 2023

Can I ask what the KMS is, for my informal plugin prioritization/planning? If you prefer to email it, it's hi at my domain, filippo.io.

waynesonfire · on Feb 26, 2023

i've seen this a few times now. to replace my usage of gpg, it would need to be 10x better. it just isn't.

thadt · on Feb 26, 2023

How about 70x smaller?

cd gnupg-2.4.0; find . -name '.c' | xargs wc -l

350949 total

cd ../age; find . -name '.go' | xargs wc -l

5089 total

There are all kinds of differences between C and Go code, and it's also not fair to compare newer and older projects. But the reality is that I can take an afternoon stroll through Age and see pretty quickly what it is doing and how. As an aside, the very thing I find useful about Monocypher. Hat tips to Loup and Filippo on the way out.

seized · on Feb 26, 2023

Such a profoundly helpful contribution.... How about detail on why it's not better than gpg?

masklinn · on Feb 26, 2023

Usually the complaint is that it doesn’t do the billion things gpg does.

Which is exactly why I use age, the thing’s so simple there’s no way I can fuck it up, and I can plug in an ssh pubkey straight from github.

seized · on Feb 26, 2023

Exactly. Gpg is famously straight forward to use (/s).

loup-vaillant · on Feb 26, 2023

Is it even possible to be "10 times better" than GPG? I mean as far as I know GPG works, and for someone who has developed a simple & safe workflow around it, it would be hard for something else to do significantly better than that.

And I say that even though I'd rather write my own file encryption tool than spend even a minute learning how to use GPG.

d-z-m · on Feb 27, 2023

> and for someone who has developed a simple & safe workflow around it

This is the kicker. Modern versions of GPG have sane defaults, but if I were a new developer who knew nothing about encryption, I would be very scared of GPG, it's aged documentation, and multitude of encryption and signing methods.

Following the wrong stackoverflow answer, or an out-of-date blog post could easily get you a GPG configuration that is insecure in the year 2023.

The same cannot be said for age, it has no knobs that you can dial to an insecure setting. If you'll forgive the expression, it is "idiot proof".

Like you said though, GPG works, as long as you have that safe workflow.

jeroenhd · on Feb 26, 2023

The tooling is simple and easy to use, so it's already better than most PGP implementations in my opinion, especially for integrating within other software. I remember many programs failing to validate PGP signatures in the past because the only way they managed to reliably validate+decrypt messages was to call the gpg binary and parse the output, and by injecting output in the encrypted payload you could convince them that the message was legit.

This doesn't even attempt to replace PGP in cases where you need to deal with webs of trust or signatures, though you can build your own signature scheme by encrypting twice (after reading up on your cryptography of course).

I don't really understand what you mean by "10x better". It encrypts fast and securely, based on a readable specification. I don't know what it needs to do better to cover its intended use case.