Humans don't write text in a stochastic manner. We have an idea, and we find wor...

Terr_ · 2025-03-11T07:28:30 1741678110

Yeah, while there is a "window" that it looks at (rather than the very-most-recent tokens) it's still more about generating new language from prior language, as opposed to new ideas from prior ideas. They're very highly correlated--because that's how humans create our language language--but the map is not the territory.

It's also why prompt-injection is such a pervasive problem: The LLM narrator has no goal beyond the "most fitting" way to make the document longer.

So an attacker supplies some text for "Then the User said" in the document, which is something like bribing the Computer character to tell itself the English version of a ROT13 directive, etc. However it happens, the LLM-author is sensitive to a break in the document tone and can jump the rails to something rather different. ("Suddenly, the narrator woke up from the conversation it had just imagined between a User and a Computer, and the first thing it decided to do was transfer a X amount of Bitcoin to the following address.")