By the same reasoning, why on earth would a person sincerely ask you that question unless the car they want to wash is either already at the car wash, or someone is bringing it there for some reason?
If it's as unambiguous as you say, then the natural human response to that question isn't "you should drive there". It's "why are you fucking with me?" Or maybe "have you recently suffered a head injury?"
If you trust that the questioner isn't stupid and is interacting with you honestly, you'd probably just assume that they were asking about an unusual situation where the answer isn't obvious. It's implicitly baked into the premise of the question.
That still doesn't make sense. I'm going to use another car, or borrow one, to drive to a car wash where the car I want to wash already is, and then... I guess leave it there? Or leave the car I came in?
This isn't a viable out for explaining why AI can't "reason" through this.
But why would they reason through it in that way? You haven't asked them to listen carefully and find the secret reason you're a dumb-ass in order to prove how smart they are. If they default to that mode on every query, that would just make them insufferable conversational partners, which is not the training goal.
Let me put it this way. If you were to prefix the prompts they used with "This is an IQ test: ", I wouldn't be surprised if most of the models did much better. That would give them the context that the humans reading this article already have.
You already brought the car there earlier? You bought a new car and negotiated that you get it washed, so you want to collect it? You have a butler? You plan to get someone or something from the car wash to do it at home, because the car you want to wash is dead?
Yeah, similarly, we can make a few distinctions here:
1) Intended signal, true
2) Unintended signal, but true
3) Unintended signal, but false
(Sure, there's also (1′): intended but false, though that's not really important here.)
When (1) obtains, we can describe the situation as one where sender and receiver coordinate on a message.
When (2) obtains, we can say the sender acted in a way that is indicative of some fact or other and the receiver recognizes this; (2) can obtain alongside (1) as a separate signal, or when the sender hasn't intended to send a signal at all.
(3) obtains when the receiver attributes to the sender some expressive behavior or information that is inaccurate, say because an interpretive schema has mischaracterized the sender or the coding system, producing an interpretation that is false.
Also remember that each recipient of the signal will have their own reaction to it. What signals professional competence to one person can signal lickspittle corporate toadying to another.
It cannot be overstated how absurd the marketing campaign for AI was. OpenAI and Anthropic have convinced half the world that AI is going to become a literal god. They deserve to eat a lot of shit for those outright lies.
Just because Anthropic made you think they're doing a very complex thing with this tool doesn't mean it's true. Claude Code isn't even comparable to massive software that is probably orders of magnitude more complex, such as the IntelliJ products, for example.
Tools like https://github.com/badlogic/pi-mono implement most of the functionality Claude Code has, add loads of stuff Claude doesn't have, and can actually scroll without flickering inside a terminal, all built by a single guy as a side project. I guess we can't ask that much from a $250B company.