Hacker News

Perhaps "one model to rule them all" isn't the best approach.


There's probably a huge amount of room for improvement in the RLHF process. If there is still low-hanging fruit, that's where it would be.


"I dunno" would have to be marked as a good or neutral response in the RLHF process, and that seems like a problematic training incentive.


In an ideal world, "I don't know" would be scored worse than a correct answer but much better than a wrong answer.

In the UK, there is a competition called the "junior maths challenge", or something, which is a multiple choice quiz where correct answers are +1 and incorrect answers are -6 (so guessing has negative EV). I think we need a similar scoring system here.
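
To make the arithmetic concrete, here is a rough sketch of that scoring rule using the +1 / -6 numbers quoted above. The score of 0 for "I don't know", the five-option quiz, and the function names are my own assumptions for illustration, not part of any real competition rules or RLHF setup.

```python
from fractions import Fraction

# Scores from the comment above; ABSTAIN = 0 is an assumption for "I don't know".
CORRECT = 1
WRONG = -6
ABSTAIN = 0


def expected_score(p_correct, correct=CORRECT, wrong=WRONG):
    """Expected score of answering when the answer is right with probability p_correct."""
    return p_correct * correct + (1 - p_correct) * wrong


def break_even_confidence(correct=CORRECT, wrong=WRONG, abstain=ABSTAIN):
    """Confidence at which answering and abstaining have the same expected score."""
    return Fraction(abstain - wrong, correct - wrong)


def should_answer(p_correct, correct=CORRECT, wrong=WRONG, abstain=ABSTAIN):
    """True if answering beats saying "I don't know" in expectation."""
    return expected_score(p_correct, correct, wrong) > abstain


# Blind guess on a five-option question (five options is an assumption).
print(expected_score(Fraction(1, 5)))   # -23/5, i.e. -4.6 per guess
print(break_even_confidence())          # 6/7, roughly 86% confidence
print(should_answer(Fraction(9, 10)))   # True
print(should_answer(Fraction(1, 2)))    # False
```

Under these numbers, answering only pays off above about 86% confidence, so a well-calibrated guesser would say "I don't know" on anything shakier. That gives the ordering from the comment above: correct > "I don't know" > wrong.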



