Are there good open models out there that beat Gemini 2.5 Flash on price? I often run data extraction queries ("here is this article, tell me xyz") with structured output (Pydantic) and I'm not aware of any feasible (= supports Pydantic) cheap enough solution :/
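For context, the workflow in question is just "define a Pydantic schema, have the model emit matching JSON, validate it" — here's a minimal sketch, assuming Pydantic v2; the `ArticleFacts` schema and the sample JSON are made up for illustration (in practice the JSON would come from the LLM's structured-output response):

```python
from pydantic import BaseModel

# Hypothetical extraction schema for the "tell me xyz about this article" case.
class ArticleFacts(BaseModel):
    title: str
    author: str
    key_points: list[str]

# Stand-in for the JSON the model would return under a structured-output mode.
raw = '{"title": "Foo", "author": "Bar", "key_points": ["a", "b"]}'

# Validates types and required fields; raises ValidationError on bad output.
facts = ArticleFacts.model_validate_json(raw)
print(facts.title)  # → Foo
```

Any provider that can be coerced into emitting schema-conforming JSON (via JSON mode, grammar constraints, or a client library) can slot into this; the Pydantic side stays identical.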
> every single product/feature I've used other than the Claude Code CLI has been terrible
yeah they're shipping too fast and everything is buggy as shit
- the fork conversation button doesn't even work anymore in the VSCode extension
- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.
Batshit situation, respectable position from Dario throughout.
But there's some irony in this happening to Anthropic after all the constant hawkish fearmongering about the evil Chinese (and open source AI sentiment too).
Horrific comparison point. LLM inference is way more expensive locally for single users than running batch inference at scale in a datacenter on actual GPUs/TPUs.
While the idea in the post is an interesting one, the analogy to planing is terrible. The difference in results between a power planer and a hand plane (even with a pretty basic blade) is night and day. Wood planed with high-quality, sharp steel has a finish that doesn't even need oil or varnish.
People talk about how non-AI code will become an artisanal craft, and I think it's a bit of a stretch. The one exception might be when code has an intrinsic aesthetic quality in itself, rather than just functional output — something like obfuscated C code contest entries. Hand-worked wood might be crappy too, like a school woodwork birdhouse project made by a beginner, but a truly artisanally crafted piece of furniture or cabinetry is very tangibly superior to an IKEA bookshelf or other industrial stuff.
On the point of doing work for the sake of doing work and not for the sake of the value of the output, this is nothing new, as suggested in the blog post. But the more apt analogy would be all the "bullshit jobs" that have existed for decades in modern corporations. People who expand their teams to justify more budget to hire more people to create more work to expand their teams to get bigger budgets, etc. All the while producing nothing of real value in the company. The thing that AI seems to have done is accelerated and exaggerated this tendency, maybe since it was already the natural tendency within the logic of our corporate work culture.
> Special shout out to Google who to this date seem to not support tool call streaming which is extremely Google.
Google doesn't even provide a tokenizer to count tokens locally. The results of this stupidity can be seen directly in AI Studio, which makes an API call to count_tokens every time you type in the prompt box.
I have a SKILL.md for marimo notebooks with instructions in the frontmatter to always read it before working with marimo files. But half the time Claude Code still doesn't invoke it even with me mentioning marimo in the first conversation turn.
I've resorted to typing "read marimo skill" manually, and that works fine. Technically you can invoke skills with slash commands, but that sends the message off immediately too, which just wastes a turn.
But the actual concept of instructions to load in certain scenarios is very good and has been worth the time to write up the skill.
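For anyone who hasn't written one: a skill is just a SKILL.md file with YAML frontmatter that tells the agent when to load it. Rough shape of the marimo one described above — the field values and bullet points here are illustrative, not the actual file:

```markdown
---
name: marimo-notebooks
description: Read this skill before reading or editing any marimo notebook file.
---

# Working with marimo notebooks

- marimo notebooks are plain .py files; each cell is a function decorated
  with @app.cell.
- Re-run checks after edits rather than assuming cell order.
```

The `description` is what the agent matches against when deciding whether to pull the skill in, which is exactly the step that fails half the time per the comment above.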