Hacker News

What is wrong with "claude --chrome"?




The claude --chrome command has a few limitations:

1. It exposes low-level tools that make your agent interact directly with the browser. That is extremely slow, VERY expensive, and less effective, because the agent ends up dealing with UI mechanics instead of thinking about the higher-level goal or intent.

2. It makes Claude operate the browser via screenshots and coordinate-based interaction, which does not work for tasks like data extraction, where the model needs to attend to the whole page. The agent has to scroll repeatedly and read one small screenshot at a time, and it often misses critical context outside the viewport. It also makes the task harder, because the model has to figure out both what to do and how to do it, which means you need larger models to make this paradigm actually work.

3. Because it uses your local browser, it has full access to your authenticated accounts by default, which might not be ideal in a world where prompt injection attacks are only getting started.
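To put a rough number on the scrolling cost in point 2, here is a back-of-the-envelope sketch. The page height, viewport height, and overlap are made-up values for illustration, and `screenshots_needed` is a hypothetical helper, not anything from the Claude tooling.

```python
import math


def screenshots_needed(page_height_px: int, viewport_height_px: int, overlap_px: int = 0) -> int:
    """Count the viewport-sized captures needed to cover a page from top to bottom."""
    if viewport_height_px <= overlap_px:
        raise ValueError("viewport must be taller than the overlap")
    if page_height_px <= viewport_height_px:
        return 1  # the whole page fits in one capture
    # Each scroll advances by the viewport height minus any overlap kept for context.
    step = viewport_height_px - overlap_px
    return 1 + math.ceil((page_height_px - viewport_height_px) / step)


# Hypothetical numbers: a 12,000 px page read through an 800 px viewport.
print(screenshots_needed(12_000, 800))       # 15
print(screenshots_needed(12_000, 800, 100))  # 17
```

Every one of those captures is a separate model call with its own image tokens, which is where the "slow and expensive" part comes from on long pages.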

If you actively use the --chrome command, we'd love to hear about your experience!


I am sure they measured the difference, but I am wondering why reading screenshots + coordinates would be more efficient than selecting ARIA labels? https://github.com/Mic92/mics-skills/blob/main/skills/browse.... The JavaScript snippets should at least be more reusable if you want to semi-automate websites with memory files.
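For what "selecting ARIA labels" could look like, here is a minimal sketch using only Python's standard library. It is not taken from the linked skill: `AriaLabelIndex`, `index_aria_labels`, and the sample `PAGE` markup are all hypothetical. The idea is that the whole document gets indexed in one pass, with no viewport or coordinates involved.

```python
from html.parser import HTMLParser


class AriaLabelIndex(HTMLParser):
    """Collect every start tag that carries a non-empty aria-label attribute."""

    def __init__(self):
        super().__init__()
        self.by_label = {}  # label text -> (tag name, attribute dict)

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        label = attr_map.get("aria-label")
        if label:
            self.by_label[label] = (tag, attr_map)


def index_aria_labels(html: str) -> dict:
    """Parse the full HTML string and return a label -> element lookup."""
    parser = AriaLabelIndex()
    parser.feed(html)
    parser.close()
    return parser.by_label


# Hypothetical page markup, for illustration only.
PAGE = """
<form>
  <input aria-label="Search" type="text" name="q">
  <button aria-label="Submit search" type="submit">Go</button>
</form>
"""

labels = index_aria_labels(PAGE)
print(sorted(labels))       # ['Search', 'Submit search']
print(labels["Search"][0])  # input
```

In a live browser the equivalent lookup would be a CSS attribute selector such as [aria-label="Search"], which stays valid across layout changes in a way that pixel coordinates do not.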

claude --chrome works, but as the OP mentions, they can do it 20x faster by passing in higher-level commands.


