I'd be interested to see results with Opus 4.6 or 4.5
Also, I bet the quality of these docs varies widely across both human- and AI-generated ones. Good AGENTS.md files should have progressive disclosure so only the items required by the task are pulled in (e.g. for DB schema related topics, see such and such a file).
Then there's the choice of pulling things into Agents.md vs skills which the article doesn't explore.
I do feel for the authors, since the article already feels old. The models and tooling around them are changing very quickly.
Agree that progressive disclosure is fantastic, but
> (e.g. for DB schema related topics, see such and such a file).
Rather than doing this, put another AGENTS.md file in a DB-related subfolder. It will be automatically pulled into context when the agent reads any files in that folder. This is supported out of the box by any agent worth its salt, including OpenCode and CC.
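As a sketch (folder names are illustrative, not from the article), the layout looks like this - the nested file is picked up automatically when the agent touches anything under src/db/:

```
repo/
├── AGENTS.md            # project-wide conventions only
└── src/
    └── db/
        ├── AGENTS.md    # schema/migration guidance, auto-loaded here
        ├── schema.sql
        └── migrations/
```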
IMO static instructions referring an LLM to other files are an anti-pattern, at least with current models. This is a flaw of the skills spec, which refers to creating a "references" folder and such. I think initial skills demos from Anthropic also showed this. This doesn't work.
> This is supported out of the box by any agent worth its salt, including OpenCode and CC.
I thought Claude Code didn't support AGENTS.md? At least according to this open issue[0], it's still unsupported and has to be symlinked to CLAUDE.md to be automatically picked up.
You're right, for CC it's "nested CLAUDE.md files". The support I meant was about the "automatic inclusion in context upon sibling-or-child file touch" feature, rather than the name of the file.
Fair, I was hoping there was a feature that I was missing. Minor papercut to have to include harness-specific files/symlinks in your repo but it's probably a temporary state until the tools and usage patterns are more settled.
Nah, this is intentional on Anthropic's part: of the top 20 coding agents, 19 support AGENTS.md (rough numbers, but I've seen someone else go through them). It's just a dumb IE6-style strategy.
If you have, for example, a monorepo, then you'll probably want a super lean top-level one - could be <15 lines - and then one per app, containing only stuff that applies to the app as a whole.

Then feature-specific context can be put at the level of the feature - hopefully your codebase is structured by domain rather than layer! The feature-level ones too, IMO, should usually be <15 lines. I just checked one of ours: it's 80 tokens (GPT-5 tokenizer).

It's basically answering potential "is this intentional?" questions - things that an LLM (or a fresh human) can't possibly know the answer to, because they're product decisions that aren't expressed in code. Tribal knowledge that would be in a doc somewhere. For 99% of decisions it's not needed, but there's that 1% where we've made a choice that goes against the cookie-cutter grain. If we don't put that in an AGENTS file, then every single time it's relevant there's a good chance the agent will make a wrong assumption.

Or sometimes a certain mechanic is inferable from the code, but it would take 10 different file reads to figure out something that is core to how the feature works and takes 2 sentences to explain. Then it just saves a whole lot of time.
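A hypothetical feature-level file in that spirit might look like this (contents invented for illustration, not taken from any real repo):

```
# billing/AGENTS.md
- Invoices are immutable once issued; corrections go through credit
  notes. This is a product decision, not an oversight.
- Amounts are stored as integer cents, never floats.
- The "draft" status is deliberately hidden from the admin UI.
```

Each line pre-answers an "is this intentional?" question rather than restating what the code already shows.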
It does depend on the domain. If you're developing the logic for a game, you'll need more of them and they'll be longer.
Another advantage of this split is that because they're pulled into context at just the right time, the attention layer generally does a better job of putting sufficient importance on it during that part of the task, compared to if it were in the project-level AGENTS file that was loaded at the very top of the conversation.
Progressive disclosure is good for reducing context usage but it also reduces the benefit of token caching. It might be a toss-up, given this research result.
I was trying to get it to create an image of a tiger jumping on a pogo stick, which is apparently way beyond its capabilities - it cannot even create an image of a pogo stick in isolation.
When given an image of an empty wine glass, it can't fill it to the brim with wine. The pogo stick drawers and wine glass fillers can enjoy their job security for months to come!
This is where smaller models are just going to be more constrained and will require additional prompting to coax out the physical description of a "pogo stick". I had similar issues when generating Alexander the Great leading a charge on a hippity-hop / space hopper.
You are right - I just tried, and even with reference images it can't do it for me. Maybe with some good prompting.
Because in theory I would say that knowledge is something that does not have to be baked into the model but could be added using reference images, if the model is capable enough to reason about them.
This is awesome. If you want help from other contributors, you'll probably find it easier to collaborate if you move memchess into its own repo (vs storing it in the grondilu.github.io repo). Each repo can have a dedicated GH Pages.
You get your own space that's not a cookie-cutter box like a hotel. Also, you generally stay in a part of town that is residential and not filled with other hotels and businesses. When it goes well, it feels like you really get to know the town you're visiting.
My least favorite elevator control panel is the one that is not in the elevator and is a touch screen.
If the control panel is outside the elevator, you can't change your mind, you can't push a button for someone else, and sometimes you can't even verify what floor you're being sent to.
Call panel not in the cab -- so you must be talking about a "destination dispatch" system. With DD, you go to a call kiosk and enter your desired destination floor, and the kiosk tells you which car to take. DD systems are used because they allow for much more sophisticated scheduling, and larger elevator lobbies with longer walk times; the net result is much better capacity, especially at peak traffic times.
In the cab you will typically have a button to take you back to the lobby in case you didn't get off on your desired floor and can't enter another call.
Modern destination dispatch systems are integrated with RF badge readers, so that the elevator system can be programmed with your office floor and you don't need to go to the kiosk at all - just walk past it, and it will assign you a car to take you to your office floor.
The scheduler can group passengers together to get fewer stops and longer express zones. Also, it makes double-deck elevators practical. Consider the elevators in the Willis Tower in Chicago. These are double-deck cabs, and it takes about three floors for a cab to accelerate to or decelerate from express speed and make a stop. Considering the number of people that system moves daily, it would be impossible to do without DD.
"Thank you for using SortTable from https://www.kryogenix.org/code/browser/sorttable/. If you, the site owner, would like to use SortTable without this notice, please download the script from [here](URL link) and add it to your server."
If you want to get fancy you can serve the modified script only if it's being hot linked and the original script if it's being used on www.kryogenix.org - or another website that has permission to use the script.
I think this is a brilliant solution - you can also make it the second row in the table that's being sorted (right after the table header). That way there's no breakage and it's a good enough deterrent.
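Combining the two ideas above, the decision of whether to inject the notice can be a small predicate in the script itself. A minimal sketch - the allowlist entries are placeholders, not an actual list of licensed domains:

```javascript
// Hosts permitted to use the script without the attribution notice.
// Placeholder values: a real deployment would list licensed domains.
const ALLOWED_HOSTS = ["kryogenix.org", "www.kryogenix.org"];

// Hot-linked copies run on hostnames outside the allowlist.
function shouldShowNotice(hostname) {
  return !ALLOWED_HOSTS.includes(hostname.toLowerCase());
}

// In the browser, this would gate inserting the notice as the first
// body row of the sorted table, e.g.:
//   if (shouldShowNotice(location.hostname)) { /* insert notice row */ }
```

Putting the notice in a table row (rather than replacing the script) keeps sorting functional while still nudging hot-linkers toward hosting their own copy.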
I would go further and make it a link to a Stripe Checkout page that lets you purchase a license subscription to remove the notice.