Wouldn't that be nice. I've been party and witness to enough misunderstandings to know that this is far from universally true, even for people like me who are more primed than average to spot missing context.
You absolutely can, either through the system prompt or by hardcoding overrides in the backend before the request even hits the LLM, and I can guarantee that companies like Google are doing both.
You can pattern match on the prompt (input) then (a) stuff the context with helpful hints to the LLM e.g. "Remember that a car is too heavy for a person to carry" or (b) upgrade to "thinking".
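To make that concrete, here is a rough sketch of what such a backend shim could look like. Everything in it is made up for illustration: the rule lists, `RoutedRequest`, `route` and `build_context` are hypothetical names, and I have no idea what any vendor actually runs. It only shows the shape of (a) hint stuffing and (b) flagging a request for a "thinking" model.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RoutedRequest:
    """Hypothetical container for a prompt plus whatever the shim decided."""
    prompt: str
    hints: list[str] = field(default_factory=list)
    use_thinking: bool = False


# (a) Hypothetical rules: if the pattern matches, stuff this hint into the context.
HINT_RULES = [
    (
        re.compile(r"\bcarry\b.*\bcar\b|\bcar\b.*\bcarry\b", re.IGNORECASE),
        "Remember that a car is too heavy for a person to carry.",
    ),
]

# (b) Hypothetical rules: if the pattern matches, upgrade to "thinking".
THINKING_RULES = [
    re.compile(r"\b(riddle|puzzle|trick question)\b", re.IGNORECASE),
]


def route(prompt: str) -> RoutedRequest:
    """Pattern match on the input and record any hints or upgrades."""
    req = RoutedRequest(prompt=prompt)
    for pattern, hint in HINT_RULES:
        if pattern.search(prompt):
            req.hints.append(hint)
    req.use_thinking = any(p.search(prompt) for p in THINKING_RULES)
    return req


def build_context(req: RoutedRequest) -> str:
    """Prepend the matched hints to the prompt before it reaches the model."""
    if not req.hints:
        return req.prompt
    return "\n".join(req.hints) + "\n\n" + req.prompt


if __name__ == "__main__":
    routed = route("Can I carry my car up the stairs?")
    print(build_context(routed))
    print("thinking:", routed.use_thinking)
```

The point is that none of this touches the model itself; it is a lookup table in front of it, which is why it can be shipped quickly.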
If they aren't, they should be (for more effective fraud). Devoting a few of their 200,000 employees to make criticisms of LLMs look wrong seems like an effective use of marketing budget.
It looks like they do. https://simonwillison.net/2025/May/25/claude-4-system-prompt...
They patch it in the prompt and they eventually address it in the reinforcement training. It seems the eventual goal is to patch all of these tiny "glitches" so as to hide the lack of cognition.
> Science should be about reproducibility, and almost nothing here is reproducible.
I can see your frustration. You are looking for reproducible "benchmarks". But you have to realize several things.
1) research level problems are those that bring the "unknown" into the "known" and as such are not reproducible. That is why "creativity" has no formula. There are no prescribed processes or rules for "reproducing" creative work. If there were, then they would not be considered "research".
2) things learnt and trained are already in the realm of the "known", ie, boiler-plate, templated and reproducible.
The problems in 2) above are where LLMs excel, but they have been hyped into excelling at 1) as well. And this experiment is trying to test that hypothesis.
DeepMind’s Nobel Prize was primarily for its performance in CASP, which is pretty much exactly this. Labs solve structures of proteins, but don’t publish them until after all the computational teams predict structures.
So I’m not sure where you’re coming from claiming that this isn’t scientific.
There's no memory to be needed. The US is officially an unreliable ally. They have been for several decades now. They will continue to be so. EU politicians might've been overly ambitious but they're not naive.
Honest to god, everything that Trump is doing might actually end up making the world a better place. The US hegemony really ran its course.
The US hegemony has been a tremendous boon to the world. Yes, the US has done terrible things (lots in South America, Vietnam, genocide in East Timor, failed nation building and war crimes in the Middle East, support for genocide in Palestine, etc.) This isn’t to minimize that.
But the reality is that the US benefits immensely from free democracies with rules-based open markets and international order. Again, do we break that when it suits us? Absolutely. But America being selfish has been a positive outcome compared to, for example, more war in Europe.
Polls consistently show that people recognize the benefits of US hegemony while acknowledging that the US does it purely from self-interest.
I’m well aware. (Probably any American who can name East Timor, let alone is aware of our participation in their genocide, is more likely than not to be familiar with our history.)
What you said doesn’t discount that we are better with free democracies, regardless of whether we see that through. Democracies tend to raise the per capita income across the population, which, in concert with free markets, gives our multinational corporations new markets to sell shit to.
Sometimes we have other more pressing concerns, like oil in Iran/Iraq (a democracy destroyed and created, respectively); global shipping / colonialism in our support of Israel in conflicts with Egypt over the Suez; abandoning our Kurdish allies to keep Turkey happy enough to keep military assets there.
Geopolitics doesn’t always do one thing or another, even if it were perfectly rational. And no foreign policy is that.
1. The US is a hegemony that meddles in others’ affairs
2. It does so selfishly, despite the high flying rhetoric about freedom, democracy, etc.
3. This is good
The preconditions for the absence of war in Europe came before the EU existed and have to do with the post-WWII balance of power, which was heavily driven by the United States.
The poll you reference didn't ask a single country in the Middle East (except Israel, of course!) or most South American countries (in half of which puppet governments were installed). I wonder why. And you really argue with a straight face that this is representative of "People globally"?
I'd be more insulted by your attempted slight if Pew wasn't doing the same.
I’m not at all suggesting that the United States’s history isn’t fucked. I would suggest, though, that many people recognize that there are tradeoffs and having a single global superpower provides stability in exchange for freedom.
Yes, I’m aware of our history overturning democracies in South America or Southeast Asia due to communism or Iran because of oil. I’m also aware of efforts to install democracies (e.g. Iraq) not being about freedom. The people polled understand this as well.
I’m curious which counterfactual reality you think would be better. Be specific as to what it would look like; who the regional powers would be; how they would cooperate / interfere with each other; what wars would look like, including frequency, between regional powers vs. today; whether states within their sphere of influence would be required to participate in these wars, etc.
I don’t see it as the author being lazy. Actually the opposite: I see it as being performative and a tryhard. Either way it’s annoying and doesn’t make me want to read it.
After looking into it, as I suspected, the author seems to make his living by selling people the feeling that they’re on the cutting edge of the AI world. Whether or not the feeling is true I don’t know, but with this in mind this performance makes sense.
I'm pretty sure if you criticise the US on something they care about, your posts will disappear from social media pretty quickly. Not because of political censorship but because of Trust and Safety violations.
they do it differently. the executive just lies to you while you watch a video of what's really happening, and if you start protesting, you're a domestic terrorist. or a little piggy, if you ask awkward questions.