use AI to rewrite all the spells from all the books, then try to see if AI can d...

gbalduzzi · 2026-02-06T06:35:18 1770359718

Neat idea, but why should I use AI for a find and replace?

It feels like shooting a fly with a bazooka

jack_pp · 2026-02-06T08:08:23 1770365303

it's like hiring someone to come pick up your trash from your house and put it on the curb.

it's fine if you're disabled

miohtama · 2026-02-06T06:51:12 1770360672

Bazooka guarantees the hit

xenodium · 2026-02-06T07:46:41 1770364001

I like LLMs, but guarantees in LLMs are... you know... not guaranteed ;)

throwaway290 · 2026-02-06T08:20:42 1770366042

I think that was the point

imafish · 2026-02-06T09:55:50 1770371750

If all you have is a hammer.. ;)

luckydata · 2026-02-06T08:12:29 1770365549

do you know all the spells you're looking for from memory?

wickedsight · 2026-02-06T08:57:05 1770368225

You could just, you know, Google the list.

Applejinx · 2026-02-06T10:41:36 1770374496

and then the first thing you see will be at least one of ITS AI responses, whether you liked it or not

bilekas · 2026-02-06T08:15:25 1770365725

You're missing the point; it's only a testing exercise for the new model.

happyraul · 2026-02-06T08:38:22 1770367102

No, the point is that you can set up the testing exercise without using an LLM to do a simple find and replace.

kakacik · 2026-02-06T10:51:49 1770375109

It's a test. Like all tests, it's more or less synthetic and focused on specific expected behavior. I am pretty far from LLMs now, but this seems like a very good test to see how genuine this behavior actually is (or repeat it 10x with some scrambling to go deeper).

inexcf · 2026-02-06T14:51:40 1770389500

This thread is about the find-and-replace, not the evaluation. Gambling on whether the first AI replaces the right spells just so the second one can try finding them is unnecessary when find-and-replace is faster, easier and works 100%.
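For reference, a minimal sketch of the deterministic find-and-replace being argued for here. The spell names, the invented replacements, and the function name `replace_spells` are all assumptions for illustration; nothing in the thread specifies them.

```python
import re

# Hypothetical mapping: original spell name -> invented replacement.
REPLACEMENTS = {
    "Expelliarmus": "Vantorix",
    "Lumos": "Quellis",
    "Expecto Patronum": "Morvane Tessara",
}

def replace_spells(text, replacements):
    """Replace every known spell name and count how often each was swapped."""
    # Longest names first, so a multi-word spell wins over any shorter name it contains.
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    counts = {name: 0 for name in replacements}

    def swap(match):
        counts[match.group(0)] += 1
        return replacements[match.group(0)]

    return pattern.sub(swap, text), counts

sample = 'He shouted "Expelliarmus!" and then whispered "Lumos".'
new_text, counts = replace_spells(sample, REPLACEMENTS)
print(new_text)
print(counts)
```

The per-name counts double as ground truth: they record exactly which spells were replaced and how many times, which is what a later evaluation would be checked against.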

bilekas · 2026-02-06T09:14:26 1770369266

... I'm not sure if you're trolling or if you missed the point again. The point is to test the LLM's contextual ability and correctness in performing actions that are hopefully guaranteed not to be in the training data.

It has nothing to do with the performance of the string replacement.

The initial "find" is to see how well it actually finds all the "spells" in this case, and then replaces them. Then, maybe using a separate context, evaluate whether the results are the same or whether they are skewed in favour of the training data.
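The evaluation step described above could be scored roughly like this. It is only a sketch: `score_findings` and the sample names are hypothetical, and it assumes the model's answer has already been parsed into a plain list of spell names.

```python
def score_findings(expected, reported):
    """Compare the spells a model reported against the known list."""
    # Case-insensitive comparison, so "vantorix" and "Vantorix" count as the same spell.
    expected_set = {s.casefold() for s in expected}
    reported_set = {s.casefold() for s in reported}
    return {
        "found": len(expected_set & reported_set),
        "total": len(expected_set),
        "missed": sorted(expected_set - reported_set),
        "unexpected": sorted(reported_set - expected_set),
    }

# Hypothetical data: invented replacement names vs. what a model reported.
expected = ["Vantorix", "Quellis", "Morvane Tessara"]
reported = ["vantorix", "Quellis", "Lumos"]
print(score_findings(expected, reported))
```

An entry under "unexpected" that matches an original spell name, rather than its replacement, would hint that the model is answering from training data instead of from the text it was given.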

LeoPanthera · 2026-02-06T07:45:51 1770363951

That won't help. The AI replacing them will probably miss the same ones as the AI finding them.

steve1977 · 2026-02-06T09:11:57 1770369117

I think the question was if it will still find 49 out of 50 if they have been replaced.