This article is obviously AI generated and, besides being jarring to read, it makes me really doubt its validity. You can get substantially faster parsing than `JSON.parse()` by decoding structured binary data, and it's also faster to pass a byte array than a JSON string from wasm to the browser. My guess is that not only was this article AI generated, but so were the benchmarks, and perhaps the implementation as well.
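To illustrate what I mean by decoding structured binary data: a minimal sketch, where the record layout and field names are hypothetical (not from the article), showing a fixed-layout buffer read with `DataView` next to the equivalent `JSON.parse()` call.

```javascript
// Hypothetical fixed-layout record: u32 id, f32 x, f32 y (12 bytes, little-endian).
// A wasm module could hand the browser this buffer directly instead of a JSON string.

function encodeRecord(id, x, y) {
  const buf = new ArrayBuffer(12);
  const view = new DataView(buf);
  view.setUint32(0, id, true);   // little-endian u32 at offset 0
  view.setFloat32(4, x, true);   // f32 at offset 4
  view.setFloat32(8, y, true);   // f32 at offset 8
  return buf;
}

function decodeRecord(buf) {
  // No tokenizing, no string allocation: just offset reads at known positions.
  const view = new DataView(buf);
  return {
    id: view.getUint32(0, true),
    x: view.getFloat32(4, true),
    y: view.getFloat32(8, true),
  };
}

const binary = decodeRecord(encodeRecord(7, 1.5, -2.5));
const json = JSON.parse('{"id":7,"x":1.5,"y":-2.5}');
console.log(binary.id === json.id); // both decode the same record
```

The point is that the binary path skips string scanning and per-field allocation entirely; whether it wins in practice depends on the payload shape, which is exactly why I'd want to see real benchmarks.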
> it may be less work to do something than to express your desires to an agent perfectly well
As I use AI more and more to write code, I find myself implementing things myself more and more for this reason. By the time I have explained what I want in precise detail, it's often faster to have just made the change myself.
Without enough detail SOTA models can often still get something working, but it's usually not the desired approach and causes problems later.
Why would Apple Intelligence bother them? It's very unobtrusive and actually useful when it's visible. I literally don't notice it except when it's helpful.
Good. I want to see more lawsuits going after these hyperscalers for blatantly disregarding copyright law while simultaneously benefiting from it. In a just world they would all go down and we would be left with just the OSS models. But we don't live in a fair world :(
Once again, a promising article is completely ruined by blatant AI-isms. I could only make it to the end of the pointer section before I couldn't take it anymore.
There is a real crisis of AI slop getting posted to this forum. I don't even bother reading posted articles related to AI anymore, but now it's seemingly extending to everything.
Hey, the author of the blog here. English is not my native language. I did try to write the first section myself, but for the later sections I only wrote a rough draft; then I had the intrusive thought that people would not like it because of my poor English and too many typos, so I just told an LLM to make it better, and that's how it ended up like this. While my intentions were right, my method was wrong. I will improve and fix my writing. I have taken down the blog for now; I don't want to waste other people's time. The content is good, but the presentation is wrong!
"This means reading light data requires zero locks. No mutex, no spinlock, nothing." threw up red flags, and by the time I got to "But here’s the insight" I couldn't go any further.
We are going to need some proof of cleanliness for art, writing, etc. At some point it will become impossible to tell if you are interacting with a bot, and that is when the internet dies.
i would genuinely rather read the rough draft before it got turned into this slop. it would be messier, maybe, but it’d have actual human insight and direction.
I’ve been trying to put my finger on what gives it away. It’s that there are boolean trees underneath each text decision it makes. While humans are obviously capable of that, our conclusions and framing are more continuous. This is why, for example, you constantly see LLMs defining things by what they’re not.
The main issue is that SOTA LLMs can only reason one way, forwards, and can't go back and revise a prior statement. That would remove a whole lot of "it's not this, it's that" and "the big takeaway here is" and so on. Those kinds of ideas are typically at the beginning of a human writer's output structure. An LLM can't go back and edit the first paragraph, because it has to reason (whatever that means for an LLM) its way through it to get to the big idea of the paragraph/structure. I haven't played with diffusion text models enough to know if they're a remedy for that kind of output.
When LLMs are good enough to not be detectable, what happens then? They aren't that far away atm, so it's only a matter of time until _everyone_ is assumed to be an LLM.
LLMs are trained to be precise (and more specifically: semantically precise), especially in the fine-tuning phase. An LLM trained only on the full corpus of human production would surely sound more "human," but it would also probably be pretty useless. So that's why idioms like "it's not X, it's Y" are a dead giveaway; but really, any structure that tries to "guide" our salience is a dead giveaway. Here's a random paragraph from Knuth's Literate Programming†[1]:
> For example, a system program is often designed to gather statistics about its own operation, but such statistics-gathering is pointless unless someone is actually going to use the results. In order to make the instrumentation code optional, I include the word ‘stat’ just before any special code for statistics, and ‘tats’ just after such code; and I tell WEAVE to regard stat and tats as if they were begin and end. But stat and tats are actually simple macros.
I encourage you to read that paragraph a few times. Even if you have no idea what the context is, you get that there's a point, that there's something else to dig into, that the author might be being a bit cheeky. In other words, you can feel Knuth behind the ink. Philosophers would call this intentionality[2]. LLMs produce the polar opposite of garden path sentences[3] (and, imo, that's why they're so easy to spot).
† I specifically picked something technical to illustrate that even in domains where semantic precision is of utmost importance, human expression is still just that: human.
I fed ChatGPT part of the introductory paragraph of that paper and instructed it to continue. It gave me this:
> In this light, the activity of programming becomes less an act of issuing commands and more an act of communication. The computer is, after all, an obedient but uncomprehending servant; it will execute whatever precise instructions we provide. But our colleagues, our future selves, and the broader community of readers are not so easily satisfied. They demand clarity, intention, and narrative. A program, then, should be structured not merely for execution, but for reading—its logic unfolding in a manner that mirrors the way one might naturally explain the solution to another person.
> This shift in perspective has practical consequences. When we write with exposition in mind, we are compelled to confront ambiguities that might otherwise remain hidden. Vague assumptions must be made explicit; convoluted steps must be reorganized into simpler, more digestible ideas. The discipline of explaining a program often leads to improvements in the program itself, since confusion in the prose is frequently a symptom of confusion in the underlying design.
Fascinating technology. I would not be able to immediately tell this was AI generated. So these models can, in some cases, produce text that doesn't immediately set off alarm bells. As an avid reader and writer I'm not really sure what to make of it. I don't want to consume AI-generated art or literature because it's completely beside the point, but in the future will we even be able to tell? How do we even know if anyone around us is real? Could they just be sufficiently advanced LLMs, fooling us? Am I the only human in the matrix?
Whether or not one can tell it's AI generated, one can certainly tell it's not Knuth. For one thing, the writing style is very different. Not that there haven't been other great computer scientists who may have written in this style, but it definitely doesn't sound like Knuth (there is no "being a bit cheeky" for sure). But also, the ideas it has produced are simply more of the same; kind of a natural progression / what a typical grad student may write. Knuth always has something new and surprising to say in every paragraph, he wouldn't harp on a theme like this. Also he mixes “levels” between very high and very low, while the paragraphs you quoted stay at a uniform level.
But of course, writing as good as a grad student's (just not the particular delightful idiosyncratic style of a specific person) is still very impressive and amazing, so your concerns are still valid.
Knuth's paper is 100% in the training set, so while your result is decent, it's undoubtedly tainted. But let's look at the output anyway:
> ...the activity of programming becomes less an act of issuing commands and more an act of communication
directly contradicts:
> The computer is, after all, an obedient but uncomprehending servant...
If programming becomes "an act of communication" how can an "uncomprehending servant" make heads or tails of what I'm telling it? And I get that the two aren't exactly contradictory here, but this implied claim would certainly require at least a throwaway sentence.
> When we write with exposition in mind, we are compelled to confront ambiguities that might otherwise remain hidden.
I'm being a bit nitpicky, but this is a non-sequitur; we aren't necessarily required to confront any ambiguities, even when we're trying very hard to be expository. The counter-examples I'm thinking of at the moment are contrived (amnesia, my four-year-old niece trying to tell a story, etc.) but I mainly take issue with the word "compelled."
> its logic unfolding in a manner that mirrors the way one might naturally explain the solution to another person
People explain things in all kinds of weird, circuitous ways, so while this (like all AI-generated output) seems interesting prima facie, it's actually kind of a dud when you think about it for more than 5 seconds.
> Vague assumptions must be made explicit; convoluted steps must be reorganized into simpler, more digestible ideas.
and
> ...ambiguities that might otherwise remain hidden...
directly contradicts:
> ...whatever precise instructions we provide
It seems like the computer can somehow encode "ambiguities" and "vague assumptions" as "precise instructions." How, exactly, does that work? (Spoiler: it doesn't, it's gibberish.) On the other hand, if you read Knuth's first few paragraphs, he clearly has a point in mind; I'd even say he's being a bit wordy, but never equivocating. In fact, by the fourth paragraph, he's almost giddy with excitement.