> but if we manage to find a way to hit 70%, it would be better.
Yet still absolutely worthless.
> "correct for most part, but could be better" concept something.
When humans do that we just call it "an error."
> so let's call that "correctness" or something
The appropriate term is "confidence." These LLM tools could all give you a confidence rating with each and every "fact" they attempt to relay to you. Of course they don't actually do that, because no one would use a tool that confidently gives you answers based on a 70% self-confidence rating.
We can quibble over terms, but more appropriately this is just "garbage." It's a giant waste of energy and resources that produces flawed results. All of that money and effort could be better used elsewhere.
> These LLM tools could all give you a confidence rating with each and every "fact" they attempt to relay to you. Of course they don't actually do that, because no one would use a tool that confidently gives you answers based on a 70% self-confidence rating.
Why do you believe they could give you a confidence rating? They can't, at least not a meaningful one.
Depends on the context, doesn't it? Nothing is usually 100% worthless or 100% "worthy"; there are grey areas in life where we're fine with "kind of right, most of the time". Are you saying these scenarios absolutely never exist in your world? I guess I'd be grateful if my life were always that easy.
And even those confidence ratings are useless, IMO. A model trained on wrong data will report high confidence for the wrong answer. And curating a dataset is a black art in the first place.
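To make that concrete, here is a rough sketch of the kind of "confidence" you could derive from per-token log-probabilities, assuming the tool exposes them at all. The function name `sequence_confidence` and all of the numbers are made up for illustration; this is not how any particular product computes anything.

```python
import math


def sequence_confidence(token_logprobs):
    """Geometric mean of the token probabilities, a value in (0, 1].

    This only says how likely the text is under the model,
    not whether the statement is true.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


# Hypothetical log-probs (illustrative values, not real model output).
# A wrong "fact" that appeared often in the training data:
frequent_but_wrong = [-0.05, -0.02, -0.10]
# A correct "fact" that the model rarely saw:
rare_but_right = [-1.2, -0.9, -1.6]

wrong_score = sequence_confidence(frequent_but_wrong)
right_score = sequence_confidence(rare_but_right)

print(f"frequent but wrong: {wrong_score:.2f}")
print(f"rare but right:     {right_score:.2f}")
```

With these made-up inputs the wrong answer scores far higher than the right one, which is the point: the number tracks what the training data made likely, not what is correct.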