I'd be interested to see other examples of Lexis+ AI failures, because I thought the example answer shown in the article appeared to be the "least wrong" one given that the answer was correct until Dobbs was decided less than 2 years ago.
I'm not in the LLM training space, but I'd also think that that kind of failure would be the easiest to fix, i.e. instead of starting out with "Currently...", start out with "As of <date>..."
Thanks very much for linking that. This makes me think that legal support is actually one of the worst possible uses of LLM-based AI (at least as implemented here), primarily because so much of the source material is directly contradictory, e.g. a legislature passes a law which is subsequently overturned by the courts, or decisions in lower courts are reversed by higher courts, or higher courts reverse themselves over time. It feels like you'd absolutely have to annotate all the source material in some way to say whether it was still controlling law/precedent.
Your instinct is correct. The major legal research providers (Thomson Reuters and LexisNexis) both provide “citators”, which are human annotations of which cases and statutes have been overruled, upheld, criticized, etc. One of the issues the paper describes is the fairly ham-handed way this gets integrated into these systems, causing even more trouble.
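To make the citator idea concrete, here is a minimal sketch of how such annotations could gate what a RAG system passes to the model. Everything here is an assumption for illustration: `Treatment`, `Passage`, `split_by_treatment`, and `build_context` are hypothetical names, not the schema or API of any real citator product, and a real system would need far more treatment categories than three.

```python
from dataclasses import dataclass
from enum import Enum


class Treatment(Enum):
    # Hypothetical, heavily simplified citator labels.
    GOOD_LAW = "good_law"
    CRITICIZED = "criticized"
    OVERRULED = "overruled"


@dataclass
class Passage:
    cite: str             # placeholder citation string
    text: str             # retrieved passage text
    treatment: Treatment  # human-assigned citator annotation


def split_by_treatment(passages):
    """Separate passages that may still be controlling from overruled ones."""
    usable = [p for p in passages if p.treatment is not Treatment.OVERRULED]
    dropped = [p for p in passages if p.treatment is Treatment.OVERRULED]
    return usable, dropped


def build_context(passages):
    """Build prompt context: usable passages keep their text and label,
    overruled ones are listed by citation only, with a warning."""
    usable, dropped = split_by_treatment(passages)
    lines = [f"[{p.cite}] ({p.treatment.value}) {p.text}" for p in usable]
    lines += [f"[{p.cite}] OVERRULED, do not rely on this holding" for p in dropped]
    return "\n".join(lines)
```

Listing overruled citations without their text is one possible design choice: the model can still say a case was overruled, but it never sees the outdated holding as if it were current.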
Pretty much the same thing happens when I try to get an LLM to output correct code targeting a certain library: its training data contains various conflicting versions of the library, and the result is an incompatible mix of code for different versions thrown together.
Is there actually an effective way to handle queries with RAG where time periods are relevant? I made a proof-of-concept RAG for documents on a government website shortly after GPT-3.5 came out and remember this being a big problem. The most glaring wrong answer was "Who is currently the governor?" It answered with the previous governor, likely because he was listed as such in 8 years of documents versus the 2 years at the time for the current governor.
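One common-sense approach to the governor problem, sketched below under loud assumptions, is to decay each retrieved document's similarity score by its age, so that eight years of older documents do not outvote two years of newer ones, and to stamp each passage with its date. This is not a claim about how any production RAG system works: `Doc`, `rerank_for_current`, `as_of_prefix`, and the one-year half-life are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    text: str
    published: date  # publication date taken from document metadata
    score: float     # similarity score from the retriever (assumed in 0..1)


def rerank_for_current(docs, today, half_life_days=365.0):
    """Sort docs by similarity multiplied by an exponential age decay.

    A document that is `half_life_days` old counts half as much as one
    published today. The half-life is an arbitrary illustrative value.
    """
    def weight(d):
        age_days = max((today - d.published).days, 0)
        return d.score * 0.5 ** (age_days / half_life_days)

    return sorted(docs, key=weight, reverse=True)


def as_of_prefix(doc):
    """Prefix the passage with its date so an answer can say 'As of ...'."""
    return f"As of {doc.published.isoformat()}: {doc.text}"


docs = [
    Doc("Governor X signed the bill.", date(2015, 3, 1), 0.90),
    Doc("Governor Y signed the bill.", date(2023, 1, 1), 0.80),
]
ranked = rerank_for_current(docs, today=date(2023, 6, 1))
context = "\n".join(as_of_prefix(d) for d in ranked)
```

The decay only makes sense for queries about the present; a question like "who was governor in 2015?" would need the date constraint parsed out of the query and applied as a filter instead.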