Are there good open models out there that beat Gemini 2.5 Flash on price? I often run data extraction queries ("here is this article, tell me xyz") with structured output (Pydantic) and I'm not aware of any feasible (= supports Pydantic) cheap enough solution :/
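For context, the workflow in question is just "define a Pydantic schema, have the model emit matching JSON, validate it" — here's a minimal sketch, assuming Pydantic v2; the `ArticleFacts` schema and the sample JSON are made up for illustration (in practice the JSON would come from the LLM's structured-output response):

```python
from pydantic import BaseModel

# Hypothetical extraction schema for the "tell me xyz about this article" case.
class ArticleFacts(BaseModel):
    title: str
    author: str
    key_points: list[str]

# Stand-in for the JSON the model would return under a structured-output mode.
raw = '{"title": "Foo", "author": "Bar", "key_points": ["a", "b"]}'

# Validates types and required fields; raises ValidationError on bad output.
facts = ArticleFacts.model_validate_json(raw)
print(facts.title)  # → Foo
```

Any provider that can be coerced into emitting schema-conforming JSON (via JSON mode, grammar constraints, or a client library) can slot into this; the Pydantic side stays identical.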
> every single product/feature I've used other than the Claude Code CLI has been terrible
yeah they're shipping too fast and everything is buggy as shit
- the fork conversation button doesn't even work anymore in the VSCode extension
- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.
Batshit situation, respectable position from Dario throughout.
But there's some irony in this happening to Anthropic after all the constant hawkish fearmongering about the evil Chinese (and open source AI sentiment too).
Horrific comparison point. LLM inference is way more expensive locally for single users than running batch inference at scale in a datacenter on actual GPUs/TPUs.
While the idea in the post is an interesting one, the analogy to planing is terrible. The difference in results between a power planer and a hand plane (even with a pretty basic blade) is night and day. Wood planed with high-quality, sharp steel has a finish that doesn't even need oil or varnish.
People talk about how non-AI code will become an artisanal craft, and I think it's a bit of a stretch. The one exception might be when code has an intrinsic aesthetic quality in itself, rather than just functional output — something like obfuscated C code contest entries. Hand-worked wood might be crappy too, like a school woodwork birdhouse project made by a beginner, but a truly artisanally crafted piece of furniture or cabinetry is very tangibly superior to an IKEA bookshelf or other industrial stuff.
On the point of doing work for the sake of doing work and not for the sake of the value of the output, this is nothing new, as suggested in the blog post. But the more apt analogy would be all the "bullshit jobs" that have existed for decades in modern corporations. People who expand their teams to justify more budget to hire more people to create more work to expand their teams to get bigger budgets, etc. All the while producing nothing of real value in the company. The thing that AI seems to have done is accelerated and exaggerated this tendency, maybe since it was already the natural tendency within the logic of our corporate work culture.
> Special shout out to Google who to this date seem to not support tool call streaming which is extremely Google.
Google doesn't even provide a tokenizer to count tokens locally. The results of this stupidity can be seen directly in AI Studio, which makes an API call to count_tokens every time you type in the prompt box.
I have a SKILL.md for marimo notebooks with instructions in the frontmatter to always read it before working with marimo files. But half the time Claude Code still doesn't invoke it even with me mentioning marimo in the first conversation turn.
I've resorted to typing "read marimo skill" manually, and that works fine. Technically you can invoke skills with slash commands, but that sends the message off immediately too, which just wastes a turn.
But the actual concept of instructions to load in certain scenarios is very good and has been worth the time to write up the skill.
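For anyone who hasn't written one: a skill is just a SKILL.md file with YAML frontmatter that tells the agent when to load it. Rough shape of the marimo one described above — the field values and bullet points here are illustrative, not the actual file:

```markdown
---
name: marimo-notebooks
description: Read this skill before reading or editing any marimo notebook file.
---

# Working with marimo notebooks

- marimo notebooks are plain .py files; each cell is a function decorated
  with @app.cell.
- Re-run checks after edits rather than assuming cell order.
```

The `description` is what the agent matches against when deciding whether to pull the skill in, which is exactly the step that fails half the time per the comment above.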