(I really need a macro for this comment, I keep repeating it :D ) Claude is a pa...

(I really need a macro for this comment, I keep repeating it :D )

Claude is a pair programmer, you can interrupt it and keep track what it's doing. It's VERY results-oriented, aiming to be "done" as fast as possible. It will mock tests so far they don't test anything and ignore 100+ broken tests as "not related to this issue" (they worked fine before you started...). Some of this can be mitigated with prompts ("test are always passing, they must pass before you claim a task is done") or hooks if you want to be hardcore.

Codex is an outsourced Indian development team. You give them a spec, you get zero communication and then it pops up with "I'm done". Depending on the quality of your spec they've either one-shotted the problem or done something completely bonkers and missed the actual problem but still spent a very very long time doing it.

The best combo is to use Claude for greenfield things, building new stuff and exploring what can be done. Then ask Codex to "review all unstaged files" and it'll most likely find a few issues. Give that report to Claude and ask "do you agree with this review?" and have it fix the ones all three agree (you, Claude and Codex).

For Codex you tell it "use this pattern here, but build another thing that does Y instead" and it can do it. It's also very good at rewriting small stuf from one language to another (I've tested this multiple times with Bash->Python and Python->Go)