j_maffe · 2026-02-20T23:46:00 1771631160

That's great for you but unfortunately the overwhelming majority of people do indeed regularly use these features.

j_maffe · 2026-02-16T08:34:51 1771230891

Right. But, unlike AI, we are usually aware when we're lacking context and inquire before giving an answer.

dxdm · 2026-02-16T09:04:10 1771232650

Wouldn't that be nice. I've been party and witness to enough misunderstandings to know that this is far from universally true, even for people like me who are more primed than average to spot missing context.

j_maffe · 2026-02-16T17:02:06 1771261326

I never said it's universally true.

mlrtime · 2026-02-16T12:11:06 1771243866

TIL my wife may be AI!

j_maffe · 2026-02-16T08:22:18 1771230138

You can't "patch" LLMs in 4 hours, and this is not the kind of question that triggers a web search.

vimda · 2026-02-16T12:22:33 1771244553

You absolutely can, either through the system prompt or by hardcoding overrides in the backend before it even hits the LLM, and I can guarantee that companies like Google are doing both

tlogan · 2026-02-16T12:33:03 1771245183

This has been viral on TikTok for at least one week. Not really 4 hours.

nroets · 2026-02-16T08:46:35 1771231595

You can pattern match on the prompt (input) then (a) stuff the context with helpful hints to the LLM e.g. "Remember that a car is too heavy for a person to carry" or (b) upgrade to "thinking".
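A minimal sketch of option (a), assuming a hand-maintained rule table; the names `HINT_RULES` and `augment_prompt` and the regex itself are hypothetical illustrations, not anything a real provider is known to run. The returned flag could also be used to route the request to a "thinking" mode, as in option (b).

```python
import re
from typing import Tuple

# Hypothetical rule table: each entry pairs a regex with a hint to inject.
# Both the pattern and the hint text are illustrative assumptions.
HINT_RULES = [
    (
        re.compile(r"\bcar\b.*\b(carry|walk)\b", re.IGNORECASE),
        "Remember that a car is too heavy for a person to carry.",
    ),
]


def augment_prompt(user_prompt: str) -> Tuple[str, bool]:
    """Prepend matching hints to the prompt before it reaches the model.

    Returns the (possibly augmented) prompt and a flag saying whether any
    rule matched; a caller could use the flag to escalate to a slower mode.
    """
    hints = [hint for pattern, hint in HINT_RULES if pattern.search(user_prompt)]
    if not hints:
        return user_prompt, False
    return "\n".join(hints) + "\n\n" + user_prompt, True


if __name__ == "__main__":
    prompt, matched = augment_prompt("My car is at the wash. Can I carry it home?")
    print(matched)
    print(prompt)
```

The obvious weakness is that each rule only covers the phrasings someone thought to write a pattern for.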

throwuxiytayq · 2026-02-16T09:32:44 1771234364

Yes, I’m sure that’s what engineers at Google are doing all day. That, and maintaining the moon landing conspiracy.

anonymous_user9 · 2026-02-16T10:58:22 1771239502

If they aren't, they should be (for more effective fraud). Devoting a few of their 200,000 employees to make criticisms of LLMs look wrong seems like an effective use of marketing budget.

rluna828 · 2026-02-17T19:22:21 1771356141

It looks like they do. https://simonwillison.net/2025/May/25/claude-4-system-prompt... They patch it in the prompt and they eventually address it in the reinforcement training. It seems the eventual goal is to patch all of these tiny "glitches" so as to hide the lack of cognition.

londons_explore · 2026-02-16T08:48:19 1771231699

A tiny bit of fine-tuning would take minutes...

j_maffe · 2026-02-07T18:25:16 1770488716

The timed-reveal aspect is also interesting.

data_maan · 2026-02-07T22:42:04 1770504124

How is that interesting from a scientific point of view? This seems more like a social experiment dressed as science.

Science should be about reproducibility, and almost nothing here is reproducible.

bwfan123 · 2026-02-08T17:24:27 1770571467

> Science should be about reproducibility, and almost nothing here is reproducible.

I can see your frustration. You are looking for reproducible "benchmarks". But you have to realize several things.

1) research-level problems are those that bring the "unknown" into the "known" and as such are not reproducible. That is why "creativity" has no formula. There are no prescribed processes or rules for "reproducing" creative work. If there were, then they would not be considered "research".

2) things learnt and trained are already in the realm of the "known", i.e., boilerplate, templated and reproducible.

The problems in 2) above are where LLMs excel, but they have been hyped into excelling at 1) as well. And this experiment is trying to test that hypothesis.

cowsandmilk · 2026-02-08T03:22:10 1770520930

DeepMind’s Nobel Prize was primarily for its performance in CASP, which is pretty much exactly this. Labs solve structures of proteins, but don’t publish them until after all the computational teams predict structures.

So I’m not sure where you’re coming from claiming that this isn’t scientific.

data_maan · 2026-02-08T08:23:22 1770539002

It wasn't like this in any way.

CASP relies on a robust benchmark (not just 10 random proteins), and has clear participation criteria, objective metrics for how the eval plays out, etc.

So I stand by my claim: This isn't scientific. If CASP is Japan, a highly organized & civilized society, this is a banana republic.

thesmtsolver2 · 2026-02-08T01:28:48 1770514128

Reproducibility is just one aspect of science; logic + reasoning from principles and data is the major aspect.

There are some experiments which cannot be carried out more than once.

data_maan · 2026-02-08T08:24:48 1770539088

> There are some experiments which cannot be carried out more than once

Yes, in which case a very detailed methodology is required: which hardware, runtimes, token counts etc.

This does none of that.

j_maffe · 2026-02-05T19:49:36 1770320976

This hypothesis is tested regularly by plenty of live benchmarks. The services usually don't decay in performance.

j_maffe · 2026-02-03T17:44:18 1770140658

There's no memory needed. The US is officially an unreliable ally. They have been for several decades now. They will continue to be so. EU politicians might've been overly ambitious but they're not naive.

j_maffe · 2026-02-03T17:40:16 1770140416

Honest to god, everything that Trump is doing might actually end up making the world a better place. The US hegemony really ran its course.

tyre · 2026-02-03T17:54:37 1770141277

The US hegemony has been a tremendous boon to the world. Yes, the US has done terrible things (lots in South America, Vietnam, genocide in East Timor, failed nation building and war crimes in the Middle East, support for genocide in Palestine, etc.) This isn’t to minimize that.

But the reality is that the US benefits immensely from free democracies with rules-based open markets and international order. Again, do we break that when it suits us? Absolutely. But America being selfish has been a positive outcome compared to, for example, more war in Europe.

Polls consistently show that people recognize the benefits of US hegemony while acknowledging that the US does it purely from self-interest.

j_maffe · 2026-02-03T21:54:38 1770155678

> But the reality is that the US benefits immensely from free democracies

Would you like for me to start counting the number of times the US helped install a democracy vs the number of times it installed dictatorships?

tyre · 2026-02-04T00:24:05 1770164645

I’m well aware. (Probably any American who can name East Timor, let alone is aware of our participation in their genocide, is more likely than not to be familiar with our history.)

What you said doesn’t discount that we are better with free democracies, regardless of whether we see that through. Democracies tend to raise the per capita income across the population, which, in concert with free markets, gives our multinational corporations new markets to sell shit to.

Sometimes we have other more pressing concerns, like oil in Iran/Iraq (a democracy destroyed and created, respectively); global shipping / colonialism in our support of Israel in conflicts with Egypt over the Suez; abandoning our Kurdish allies to keep Turkey happy enough to keep military assets there.

Geopolitics doesn’t always do one thing or another, even if it were perfectly rational. And no foreign policy is that.

melesian · 2026-02-03T18:07:04 1770142024

The absence of war in Europe is more down to the EU than the US. Polls do not consistently show anything of the sort.

tyre · 2026-02-03T20:01:57 1770148917

Yes, they do:

https://www.pewresearch.org/global/2023/06/27/international-...

People globally have routinely acknowledged that:

1. The US is a hegemon that meddles in others’ affairs

2. It does so selfishly, despite the high flying rhetoric about freedom, democracy, etc.

3. This is good

The preconditions for the absence of war in Europe came before the EU existed and have to do with the post-WWII balance of power, which was heavily driven by the United States.

j_maffe · 2026-02-03T21:53:13 1770155593

The poll you reference didn't survey a single country in the Middle East (except Israel, of course!) or most South American countries (in half of which puppet governments were installed). I wonder why. And you really argue with a straight face that this is representative of "People globally"? I'd be more insulted by your attempt at a slight if Pew wasn't doing the same.

tyre · 2026-02-04T00:16:14 1770164174

I’m happy to review other data if you have it!

I’m not at all suggesting that the United States’s history isn’t fucked. I would suggest, though, that many people recognize that there are tradeoffs and having a single global superpower provides stability in exchange for freedom.

Yes, I’m aware of our history overturning democracies in South America or Southeast Asia due to communism or Iran because of oil. I’m also aware of efforts to install democracies (e.g. Iraq) not being about freedom. The people polled understand this as well.

I’m curious which counterfactual reality you think would be better? Be specific as to what it would look like; who the regional powers would be; how they would cooperate / interfere with each other; what wars would look like, including frequency, between regional powers vs. today; whether states within their sphere of influence would be required to participate in these wars, etc etc etc.

j_maffe · 2026-02-02T12:44:20 1770036260

Why should I bother to read an article that the "author" didn't write? Might as well just go prompt Claude. Or is this about saving tokens?

brap · 2026-02-02T12:48:36 1770036516

I don’t see it as the author being lazy, actually the opposite, I see it as being performative and a tryhard. Either way it’s annoying and doesn’t make me want to read it.

After looking into it, as I suspected, the author seems to make his living by selling people the feeling that they’re on the cutting edge of the AI world. Whether or not the feeling is true I don’t know, but with this in mind this performance makes sense.

j_maffe · 2026-01-26T17:03:44 1769447024

Perhaps they're pointing out the double standard in how much condemnation China gets compared to the US, lack of censorship notwithstanding.

rwmj · 2026-01-26T17:04:59 1769447099

Are you saying we cannot talk about the bad things the US has done?

j_maffe · 2026-01-26T17:05:41 1769447141

No, I'm saying we can, unlike how it is in China. Besides that point, I think GP is arguing that China is villainized more than the US.

torginus · 2026-01-26T18:12:34 1769451154

I'm pretty sure if you criticise the US on something they care about, your posts will disappear from social media pretty quickly. Not because of political censorship but because of Trust and Safety violations.

spankalee · 2026-01-26T17:10:11 1769447411

Are you actually claiming the US is not criticized here?

johnjames87 · 2026-01-26T17:09:35 1769447375

The US govt doesn't force censorship of its history, good or bad.

exe34 · 2026-01-26T17:24:01 1769448241

they do it differently. the executive just lies to you while you watch a video of what's really happening, and if you start protesting, you're a domestic terrorist. or a little piggy, if you ask awkward questions.

entropicdrifter · 2026-01-26T17:27:10 1769448430

It tries to, in bouts

j_maffe · 2026-01-24T00:30:47 1769214647

It doesn't matter how it's stored. So long as it isn't E2EE, they (and anyone who can ask for it) will be able to access the drives.