Hacker News | dwaltrip's comments

Please don’t post slop as a comment.

You left out the first half of the prompt: “I want to wash my car”.

Yeah I see this argument being made that it’s ambiguous for humans. Uh, no? Why on earth would I walk to the car wash when I want to wash my car?

By the same reasoning, why on earth would a person sincerely ask you that question unless the car that they want to wash is either already at the car wash, or someone is bringing it to them there for some reason?

If it's as unambiguous as you say, then the natural human response to that question isn't "you should drive there". It's "why are you fucking with me?" Or maybe "have you recently suffered a head injury?"

If you trust that the questioner isn't stupid and is interacting with you honestly, you'd probably just assume that they were asking about an unusual situation where the answer isn't obvious. It's implicitly baked into the premise of the question.


The fact that this is so obvious to humans is why there's no training data that LLMs can use to know the answer.

How could the car already be at the car wash if you have the option to drive it there?

You might own multiple cars, you might be borrowing someone else's, and so forth.

That still doesn't make sense. I'm going to use another car, or borrow a car, to drive to a car wash where the car I want to wash is, and then... I guess leave it there? Or leave the car I came in?

This isn't a viable out for explaining why AI can't "reason" through this.


But why would they reason through it in that way? You haven't asked them to listen carefully and find the secret reason you're a dumb-ass in order to prove how smart they are. If they default to that mode on every query, that would just make them insufferable conversational partners, which is not the training goal.

Let me put it this way. If you were to prefix the prompts they used with "This is an IQ test: ", I wouldn't be surprised if most of the models did much better. That would give them the context that the humans reading this article already have.


You already brought the car there earlier? You bought a new car and negotiated that you get it washed, so you want to collect it? You have a butler? You plan to get someone or something from the car wash to do it at home, because the car you want to wash is dead?

It’s quite a difference…

The expected or assumed signal can differ radically from the perceived signal, often in surprising ways.

People spend so much energy doing things based on untrue assumptions about what others are thinking.

And this is before we even get into how much one should adjust their behavior based on someone else’s perception.


Yeah, similarly we can make a few distinctions here: (1) intended signal, true; (2) unintended signal, but true; (3) unintended signal, but false. (Sure, there's also (1′), intended but false, though that's not really important here.)

When (1) obtains, we can describe the situation as one where sender and receiver coordinate on a message.

When (2) obtains, we can say the sender acted in a way that is indicative of some fact or other and the receiver recognizes this; (2) can obtain when (1) obtains as a separate signal, or when the sender hasn't intended to send a signal at all.

(3) obtains when the receiver attributes to the sender some expressive behavior or information that is inaccurate, say, because an interpretive schema has characterized the sender and the coding system incorrectly, producing an interpretation that is false.


Also remember that each recipient of the signal will have their own reaction to it. What signals professional competence to one person can signal lickspittle corporate toadying to another.

Yes, but in aggregate, most people (or most groups of people) will arrive at the same conclusion for the same signal.

Else signals and signalling wouldn't be a thing and people wouldn't care for them; their reception would be a random scatter plot.


What tools do you use for wireframes / how are you generating them?

Much better sources of iron are available.

More likely we get smooshed unintentionally as the AIs seek those out.


We need it all... oh, wait, you're not silicon... sluuuuuurrrrpp...


Wow yeah very prescient.


It’s clear enough but they aren’t going out of their way to make it obvious. It’s definitely fluffed up / corporately sanitized.


Damage control to limit the rush for the exits?


The CEOs aren't here in the comments.


Which is why we ought to always bring up their BS every time people try to pretend it didn't happen.

The promises made are ABSOLUTELY relevant to how promising or not these experiments are.


I bet you get upset when you buy a new iPhone and don't love it, because Tim Cook said in the ad that they think you're going to love it.


It cannot be overstated how absurd the marketing campaign for AI was. OpenAI and Anthropic have convinced half the world that AI is going to become a literal god. They deserve to eat a lot of shit for those outright lies.


Sure, it can be beneficial. But don't forget that externalities are a thing.


sips coffee… ahh yes, let me find that classic Dropbox rsync comment


Just because Anthropic made you think they are doing a very complex thing with this tool doesn't mean it is true. Claude Code is not even comparable to massive software that is probably orders of magnitude more complex, such as the IntelliJ products, for example.

Tools like https://github.com/badlogic/pi-mono implement most of the functionality Claude Code has, even adding loads of stuff Claude doesn't have, and can actually scroll without flickering inside the terminal, all built by a single guy as a side project. I guess we can't ask that much from a 250B USD company.

Be careful with the coffee.

