More

nigeltao · 2026-01-24T00:07:23 1769213243

> Russ Cox should make a C version of his code.

https://github.com/rsc/fpfmt/blob/main/bench/uscalec/ftoa.c

vitaut · 2026-01-24T03:18:01 1769224681

Note that it has the same table of powers of 10: https://github.com/rsc/fpfmt/blob/main/bench/uscalec/pow10.h

nigeltao · 2025-10-23T23:10:20 1761261020

The top-level README has a link called "Getting Started".

nigeltao · 2025-10-23T22:57:26 1761260246

See https://github.com/google/wuffs/blob/main/doc/related-work.m...

> Kaitai Struct is in a similar space, generating safe parsers for multiple target programming languages from one declarative specification. Again, Wuffs differs in that it is a complete (and performant) end to end implementation, not just for the structured parts of a file format. Repeating a point in the previous paragraph, the difficulty in decoding the GIF format isn't in the regularly-expressible part of the format, it's in the LZW compression. Kaitai's GIF parser returns the compressed LZW data as an opaque blob.

Taking PNG as an example, Kaitai will tell you the image's metadata (including width and height) and that the compressed pixels are in the such-and-such part of the file. But unlike Wuffs, Kaitai doesn't actually decode the compressed pixels.

---

Wuffs' generated C code also doesn't need any capabilities, including the ability to malloc or free. Its example/mzcat program (equivalent to /bin/bzcat or /bin/zcat, for decoding BZIP2 or GZIP) self-imposes a SECCOMP_MODE_STRICT sandbox, which is so restrictive (and secure!) that it prohibits any syscalls other than read, write, _exit and sigreturn.

(I am the Wuffs author.)

corysama · 2025-10-24T03:04:30 1761275070

Wuffs looks pretty awesome. Thanks for making it.

Wuffs is intended for files. But, would it be a bad idea to use it to parse network data from untrusted endpoints?

nigeltao · 2025-10-24T03:39:50 1761277190

It's a great idea. Chromium uses Wuffs to parse GIF data from the untrusted network.

There's also a "wget some JSON and pipe that to what Wuffs calls example/jsonptr" example at https://nigeltao.github.io/blog/2020/jsonptr.html#sandboxing

nigeltao · 2025-09-16T03:55:02 1757994902

I like my colleague Simon Morris's observation about software complexity:

> Software has a Peter Principle. If a piece of code is comprehensible, someone will extend it, so they can apply it to their own problem. If it’s incomprehensible, they’ll write their own code instead. Code tends to be extended to its level of incomprehensibility.

nigeltao · 2025-08-06T00:25:51 1754439951

> Compressing QOI with something like LZ4 would generally outperform PNG.

https://github.com/nigeltao/qoir has some numbers comparing QOIR (which is QOI-inspired-with-LZ4) vs PNG.

QOIR has better decode speed and comparable compression ratio (depending on which PNG encoder you use).

QOIR's numbers are also roughly similar to ZPNG.

nigeltao · 2025-05-27T00:30:26 1748305826

For Wuffs, top level declarations start with either pub or pri (and both keywords have the same width, in a monospace font).

    pub status "#blah"
    pub struct foo(etc etc)
    pri func foo.bar(etc etc)

Since code is also auto-formatted, you can do things like "show me a structural overview of a package's source code" with a simple grep:

    rg -N ^p   std/jpeg/*.wuffs

If you want just the exported API, change p to pub:

    rg -N ^pub std/jpeg/*.wuffs

nigeltao · on March 13, 2025

Might be relevant to your interests:

https://nigeltao.github.io/blog/2022/gamma-aware-ordered-dit...

nigeltao · on Dec 9, 2024

You might like JWCC, which literally stands for JSON With Commas and Comments.

https://nigeltao.github.io/blog/2021/json-with-commas-commen...

nigeltao · on Dec 8, 2024

JWCC literally stands for JSON With Commas and Comments.

JWCC is also what Tailscale call HuJSON, as in "JSON for Humans", which as amusingly also what json5 claims to be.

https://github.com/tailscale/hujson

lioeters · on Dec 9, 2024

There's also HJSON, which stands for Human JSON.

https://hjson.github.io/

It has implementations in JavaScript, C#, C++, Go, Java, Lua, PHP, Python, Rust.

joshuat · on Dec 9, 2024

Who has the xkcd comic about standards handy?

nigeltao · on April 4, 2024

> all decoders will render the same pixels

Not true. Even just within libjpeg, there are three different IDCT implementations (jidctflt.c, jidctfst.c, jidctint.c) and they produce different pixels (it's a classic speed vs quality trade-off). It's spec-compliant to choose any of those.

A few years ago, in libjpeg-turbo, they changed the smoothing kernel used for decoding (incomplete) progressive JPEGs, from a 3x3 window to 5x5. This meant the decoder produced different pixels, but again, that's still valid:

https://github.com/libjpeg-turbo/libjpeg-turbo/commit/6d91e9...

JyrkiAlakuijala · on April 4, 2024

Moritz, the author of that improvement, implemented the same for jpegli.

I believe the standard does not specify what the intermediate progressive renderings should look like.

I developed that interpolation mechanism originally for Pik, and Moritz was able to formulate it directly in the DCT space so that we don't need to go into pixels for the smoothing to happen, but he computed it using a few of the low frequency DCT coefficients.

nigeltao · on April 5, 2024

> I believe the standard does not specify what the intermediate progressive renderings should look like.

This is possibly getting too academic, but IIUC for a progressive JPEG, e.g. encoded by cjpeg to have 10 0xDA Start Of Scan markers, it's actually legitimate to post-process the file, truncating to fewer scans (but re-appending the 0xD9 End Of Image marker). The shorter file is still a valid JPEG, and so still relevant for discussing whether all decoders will render the same pixels.

I might be wrong about validity, though. It's been a while since I've studied the JPEG spec.

andrewla · on April 4, 2024

I was not aware of that; I thought that it was pretty deterministic.

Nonetheless, for this particular case, comparing jpegs decoded into lossless formats is unnecessary -- you can simply compare the two jpegs directly based on the default renderer in your browser.

iggldiggl · on April 5, 2024

And nowadays, for subsampled images libjpeg post classic version 6 insists on doing the chroma upscaling using DCT where possible, so for classic 4:2:0 subsampled images (i.e. chroma resolution is half the luma resolution both horizontally and vertically) each subsampled 8x8 chroma block is now upscaled individually to 16x16 for the final image, which can and does introduce additional artefacts at the boundaries between each 16x16 px block in the final image. But the current libjpeg maintainer insists on that new algorithm because it is mathematically more beautiful…

Granted, the introduced artefacts aren't massive, but under certain circumstances they are noticeable, which is how I stumbled across that topic in the first place.

Thankfully, most software that isn't still stuck on libjpeg 6 has switched to libjpeg-turbo or some other library instead which continues using a more sensible algorithm for chroma upscaling.