Hacker News

>Needle in a Needlestack is a new benchmark to measure how well LLMs pay attention to the information in their context window

I asked GPT-4o for JavaScript code and got Python. So much for attention.



What was your query?



