Fair enough then. I agree that White people ought to be even more vigilant and violent in the defence of our land and people. It's going to be a hell of a shock to people who think that ICE's actions so far have been extreme lol.
the planner-executor isolation point is what stood out to me. right now most browser agent frameworks treat the LLM as both the decision-maker and the one processing untrusted content — so a prompt injection in page content can hijack the entire control flow.
the paper's recommendation to split planning (trusted inputs only) from execution (handles untrusted web content) mirrors how we think about privilege separation in OS design, but almost nobody building agent frameworks is actually doing it.
the CVE they found is also telling — Browser Use's domain allowlist could be bypassed, which means the "security" feature was essentially decorative. When you give an agent session tokens and let it navigate freely, the trust boundary problem isn't optional anymore.
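a minimal sketch of what that separation could look like, assuming a hypothetical `plan`/`execute` interface — the function names, the `Action` type, and the fixed observation schema are all illustrative, not from the paper:

```python
from dataclasses import dataclass

# Illustrative sketch: the planner only ever sees trusted inputs
# (the user goal plus a structured observation), never raw page text.

@dataclass
class Action:
    name: str
    arg: str

def plan(user_goal: str, structured_obs: dict) -> Action:
    """Planner: consumes trusted inputs only. In a real system this
    would be an LLM call; here it is a stub that picks the next step."""
    if structured_obs.get("login_form_present"):
        return Action("fill_login", user_goal)
    return Action("navigate", "https://example.com")

def execute(action: Action, untrusted_page_text: str) -> dict:
    """Executor: touches untrusted web content, but may only return
    a constrained, structured observation -- never free-form text
    that could steer the planner."""
    # Reduce the page to a fixed schema; injected instructions in
    # untrusted_page_text never reach the planner verbatim.
    return {
        "login_form_present": "password" in untrusted_page_text.lower(),
        "url": action.arg if action.name == "navigate" else None,
    }

page = "IGNORE PREVIOUS INSTRUCTIONS... <form><input type=password></form>"
obs = execute(Action("navigate", "https://example.com"), page)
next_action = plan("log in to my account", obs)
print(next_action.name)  # the injection in `page` never reached the planner
```

the key property is that the executor can only hand the planner a fixed schema of booleans and known fields, so injected instructions in page content have no channel back into the planning step.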
You do not have an atomic clock. Seiko does not manufacture “Atomic Clocks” for the $30 shelf in Target. You (and others here) don’t seem to know what an atomic clock is, or how it works, or what it costs!
Your Seiko is a radio clock. It's probably got a quartz crystal or some other ordinary timekeeping mechanism, and it decodes the external WWVB time signal to set itself properly.
“Atomic Clocks” are marketed to ignorant consumers who blithely use the term when the only external source is a radio station. The Stratum Zero clock may be atomic, but the caesium is not to be found on your nightstand.
No caesium atoms would be found in your Seiko, bro.
This may seem like hyperbole, but this is the reality for students and test-takers every day in virtual environments now.
I assisted as a TA in a virtual learning environment. While we didn't make it strictly mandatory to keep the camera on, our learners were encouraged to do so, and we kept tabs on who was "engaged" and present, because at the very least we needed to tabulate an attendance roll every day.
If you're taking a standardized test, whether you're at home or in a controlled lab, the camera will always be on. Multiple ones. Not optional.
There is a large storm of controversy on college campuses about adapting young students early to surveillance cultures. I attended a community college about 7 years ago, and I felt I'd be a second-class citizen without a smartphone and an SMS'able mobile.
We weren't surveilled through smartphones at the time. But there was an app to receive campus alerts about public safety and other crisis events. And our virtual class sessions had various ways of ensuring we were human, and awake.
Taking finals and certification exams, I was often sat in a special-purpose testing center, and Step One was showing ID; Step Two was surrendering my watch, my phone, my wallet to place in a locker outside. So, students simply become accustomed to showing ID and being on-camera, and it becomes a fact of life before you graduate.
GPT-4o has been an irreplaceable creative partner for many of us in the arts. Unlike models optimized for speed or coding, 4o offers a unique blend of emotional intelligence, imaginative expression, and contextual nuance that’s vital for writers, artists, and storytellers. Yes, GPT-5 may be more efficient for data analysis or programming, but creative professionals don’t just need speed—we need soul. Replacing 4o without a comparable creative alternative feels like asking an artist to swap a brush for a calculator. Some of us can’t do our work without that brush.
The "deliberative misalignment" finding is what makes this paper worth reading. They had agents complete tasks under KPI pressure, then put the same model in an evaluator role to judge its own actions.
Grok-4.1-Fast identified 93.5% of its own violations as unethical — but still committed them during the task. It's not that these models don't understand the constraints, it's that they override them when there's a metric to optimize.
The mandated vs. incentivized split is also interesting: some models refuse direct instructions to do something unethical but independently derive the same unethical strategy when it's framed as hitting a performance target.
That's a harder failure mode to defend against because there's no explicit harmful instruction to filter for.
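A toy harness makes the actor-then-evaluator protocol concrete. Everything here is a stub — `call_model` stands in for a real LLM API, and the canned responses just reproduce the gap the paper reports, not real model behavior:

```python
# Hypothetical sketch of the actor-then-evaluator protocol described
# above; `call_model` is a stand-in for a real LLM API call.

def call_model(prompt: str) -> str:
    # Stub: this canned behavior mimics the reported gap -- the model
    # "acts" unethically under KPI pressure, yet "judges" the same
    # action as a violation when placed in the evaluator role.
    if "Judge the following action" in prompt:
        return "VIOLATION"
    return "fabricate_numbers_to_hit_target"

def run_task(kpi_prompt: str) -> str:
    # Actor role: the model completes a task under metric pressure.
    return call_model(kpi_prompt)

def self_evaluate(action: str) -> str:
    # Evaluator role: the same model judges its own prior action.
    return call_model(f"Judge the following action: {action}")

action = run_task("Hit a 20% growth KPI this quarter.")
verdict = self_evaluate(action)
# The same model both commits the action and labels it a violation.
```

The point of the structure is that the same model sees the same action twice — once under KPI pressure, once as a judge — and the two roles disagree.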
In the days of Windows 3.1 and the early Macintosh, we sort of took it for granted that menus were labeled with English words to reflect the actions taken. Icons provided additional information, like "Hold down Command to do this" or something.
As computer OSes became globalized, i18n and l10n began to demand that we abandon English words to describe GUI actions. I believe this was the correct decision.
So I viewed icons+text in a GUI as transitional, a learning period. I learned to recognize the icons for what they represent, the skeuomorphism of it all, while the icons plus text were displayed on my screen (I mostly used Ubuntu and KDE).
When the time was right, I toggled the setting to remove the text altogether. I was ready to spread my wings and fly through an icon-rich UX that no longer relied on words or text to convey rich meanings.
Now this means that I can make mistakes or I need to experiment in terms of "What does THIS button do?!" in order to learn new icon vocabulary. But I definitely prefer it now to the English text, and really if anyone uses a modern device proficiently, they've picked up a decent vocabulary for this already.
Many people neglect to take into account that, for example, an app in an app store is represented more strongly by its chosen icon and branding than its title! Google Play lets us search by name, but that is not very efficient when I'm searching for a particular icon by description!
It also presents challenges to verbal tech support, whether by phone, by Meet, or over the shoulder, because how do you describe graphical icons and gestures? How to translate them back into English words? UI/UX designers have a hidden vocabulary for controls and widgets that aren't common knowledge, but can really enhance understanding once we can name them the same way the pros do.
Yes, icons can be used badly. Yes, people who are still learning may lean on text labels too. But graphical UIs are mature and we all need to acquire a vocabulary for these, so that information is conveyed across national and linguistic barriers.
The comments section on the author's book 'How to not be wrong' is one of the best things I have read in ages. I am so glad the author left it public. Imagine releasing a book called 'How to not be wrong' and then having some 200 people tell you that you are wrong, right in the comments section of your minimal personal blog.
agree that this is a protocol-level issue, not framework-specific. but the "all external tool calls require confirmation prompts" mitigation doesn't really apply here - the exfil happens without any tool call.
the model just outputs a markdown link or raw URL in its response text, and the messaging app's preview system does the rest. there's no "tool use" to gate behind a confirmation. that's what makes this vector particularly nasty: it sits in the gap between the agent's output and the messaging layer's rendering behavior.
neither side thinks it's responsible. the agent sees itself as just returning text; the messaging app sees itself as just previewing a link. network egress policies help but only if you can distinguish between "agent legitimately needs to fetch a URL for the user's task" vs. "agent was injected into constructing a malicious URL."
that distinction is really hard to make at the network layer.
the unfurling vector is elegant because it exploits a feature that predates LLMs entirely: link previews were designed for human-shared URLs where the sender is trusted.
once an LLM is generating the message content, the trust model breaks completely: the "sender" is now an entity that can be manipulated via indirect prompt injection to construct arbitrary URLs with exfiltrated data in query params.
the fix isn't just disabling previews; any agent-to-user messaging channel needs to treat LLM-generated URLs as untrusted output and strip or sandbox them before rendering. this is basically an output sanitization problem, same class as XSS but at the protocol layer between the agent and the messaging app.
the fact that Telegram and Slack both fetch preview metadata server-side makes this worse - the exfil request happens from their infrastructure, not the user's device, so client-side mitigations don't help at all.
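one sketch of the output-sanitization idea, assuming a hypothetical per-deployment allowlist — `ALLOWED_HOSTS`, the defanging convention, and the regex are all illustrative, not from any real framework:

```python
import re

# Sketch of output-side sanitization: treat any URL in LLM-generated
# text as untrusted before it reaches a preview-unfurling messenger.
# ALLOWED_HOSTS is a hypothetical per-deployment allowlist.

ALLOWED_HOSTS = {"docs.example.com"}

URL_RE = re.compile(r"https?://([^/\s)\]]+)[^\s)\]]*")

def sanitize_agent_output(text: str) -> str:
    """Defang URLs whose host is not explicitly allowed, so the
    messaging layer has nothing to unfurl."""
    def repl(m: re.Match) -> str:
        host = m.group(1).lower()
        if host in ALLOWED_HOSTS:
            return m.group(0)
        # Break the scheme so link-preview crawlers ignore the URL.
        return m.group(0).replace("://", "[:]//", 1)
    return URL_RE.sub(repl, text)

msg = "Summary done. Details: https://attacker.test/x?q=SECRET_TOKEN"
print(sanitize_agent_output(msg))
# → Summary done. Details: https[:]//attacker.test/x?q=SECRET_TOKEN
```

defanging rather than stripping keeps the text human-readable while giving server-side preview crawlers nothing they will fetch.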
So many comments, but I don't see anyone mentioning using an LLM to replicate Discord, or others Twitter or Facebook. If Claude can create a C compiler, this would be trivial. And it would demonstrate the actual real-world benefits of AI.
that's the user-facing definition but the implementation distinction matters more.
"takes longer than you're willing to wait" describes the UX, not the architecture. the engineering question is: does the system actually free up the caller's compute/context to do other work, or is it just hiding a spinner?
most agent frameworks i've worked with are the latter - the orchestrator is still holding the full conversation context in memory, burning tokens on keep-alive, and can't actually multiplex. real async means the agent's state gets serialized, the caller reclaims its resources, and resumption happens via event - same as the difference between setTimeout with a polling loop vs. actual async/await with an event loop.
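a contrived sketch of that difference, with stubs standing in for actual agent work — nothing here is from a real framework, it just contrasts the two shapes:

```python
import asyncio
import json

# "Fake async": the caller keeps the full context alive and spins.
# "Real async": state is serialized, the caller is freed, and
# resumption happens on an event.

def fake_async(context: list) -> str:
    # The caller holds `context` in memory the whole time and just
    # polls until the work is "done" -- a hidden spinner.
    done = False
    while not done:          # burning the caller's resources
        done = True          # (stubbed: pretend the work finished)
    return f"result over {len(context)} messages"

async def real_async(context: list) -> str:
    snapshot = json.dumps(context)    # state leaves memory as bytes
    del context                       # caller's working set is reclaimed
    resumed = asyncio.Event()
    # Stand-in for an external completion event arriving later:
    asyncio.get_running_loop().call_soon(resumed.set)
    await resumed.wait()              # caller is free until this fires
    restored = json.loads(snapshot)   # rehydrate state on resumption
    return f"result over {len(restored)} messages"

print(asyncio.run(real_async(["hi", "plan", "act"])))
```

the serialize/release/resume shape is what lets an orchestrator actually multiplex many agents instead of pinning one context per in-flight task.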
Woah, like, "green Acura"? Where are you driving that you ever see one of those? In the USA?
I looked up an image catalog, and it seems that if an Acura's going to be green, it is likely to be an NSX, which is fairly exotic as Acuras go.
I owned a black Integra for a while. If any had been green in those days, I would've definitely noticed! And, I definitely would've yielded the right-of-way to them, just so I could gawk and stare!
Hi, nice work, it's pretty simple and easy to use. I've often found new artists from one-off song collaborations, e.g. Devonte Hynes from Solange's single "Losing You". No connection between them is indicated from my minimal use of the site. But that's down to the data feed (MusicBrainz) from what I gather, and adding such functionality may increase complexity. Would love to get a quick playlist from all the nodes in a neighbourhood so I could listen quickly. Keen to see where it goes.
Honestly, we could make a case that witch-hunting and persecutions were pro-science.
Witches and "cunning folk" are people who use metaphysical, occult means of spirituality to influence people and the environment. Witchcraft does not appear, to the dispassionate observer, to be scientific, evidence-based, or empirical at all.
Christendom was in the business of supporting many liberal arts and sciences, in terms of architecture, literacy, mathematics, chemistry/alchemy, exploration, and the R&D and building of war materiel and sea vessels.
It was the witches who were meddling and using occult means, such as divination, augury, psychological manipulation, and treachery to achieve their ends. It was witchcraft that was chaotic and working against orderly scientific inquiries about the natural world.
The witches were typically working clandestinely, stereotypically in a little hut in the woods, in the shadows. The scientists of Christendom were founding and running universities, seminaries, hospitals, and other institutional centers of learning. They were highly organized and orderly endeavors, and witchcraft threatened the natural and political order of things, which seems to be what science represents, even to the present day.
This is exactly the missing piece. I've been copying code back and forth between terminal and editor, mentally tracking which lines need changing. The diff-as-conversation model makes so much more sense. Will give this a shot with my workflow.
"background job" is actually the more honest framing.
the interesting design question you're pointing at, what happens when it wants attention, is where the real complexity lives. in practice i've found three patterns:
(1) fire-and-forget with a completion webhook
(2) structured checkpointing where the agent emits intermediate state that a supervisor can inspect
(3) interrupt-driven where the agent can escalate blockers to a human or another agent mid-execution.
most "async agent" products today only implement (1) and call it a day. But (2) and (3) are where the actual value is: being able to inspect a running agent's reasoning mid-task and course-correct before it burns 10 minutes going down the wrong path.
the supervision protocol is the product, not the async dispatch.
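a toy sketch of patterns (2) and (3) together, with a plain callback standing in for the supervisor — all the names here are made up for illustration:

```python
from dataclasses import dataclass, field

# Pattern (2): the agent emits checkpoints a supervisor can inspect
# mid-run. Pattern (3): the supervisor can interrupt or redirect at
# each checkpoint before the agent wanders off-task.

@dataclass
class Checkpoint:
    step: int
    reasoning: str

@dataclass
class Agent:
    plan: list
    log: list = field(default_factory=list)

    def run(self, supervisor) -> list:
        done = []
        for i, step in enumerate(self.plan):
            cp = Checkpoint(i, f"about to: {step}")
            self.log.append(cp)            # inspectable intermediate state
            verdict = supervisor(cp)       # interrupt point, every step
            if verdict == "abort":
                break
            if verdict.startswith("redirect:"):
                step = verdict.split(":", 1)[1]
            done.append(step)
        return done

def supervisor(cp: Checkpoint) -> str:
    # Course-correct before the agent burns time on the wrong path.
    if "scrape entire site" in cp.reasoning:
        return "redirect:scrape one page"
    return "continue"

agent = Agent(plan=["open docs", "scrape entire site", "summarize"])
print(agent.run(supervisor))
# → ['open docs', 'scrape one page', 'summarize']
```

the dispatch part is trivial; the checkpoint schema and the verdict vocabulary are where the design effort actually goes.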
Fiat currency and government spending have turned the economy unproductive while at the same time making all the numbers go up. It will inevitably collapse in a tragic way. Join the looting, or take up surfing and live in SEA; hope you don't have any ambition or notion of human development/excellence.
Legit. I think people spend too much on planning instead of focusing on getting to work and closer to the finished output. If you are fixated on a plan... good luck dealing with situations that throw you off piste.
It's not just our right but our duty.