I find the Mistral "middle" between small LMs and 1T LMs compelling. Models that are big enough to be performant but specialised for domains and tasks: this is what I assumed we'd always head towards.
Not sure I agree with this. MD files need to be constantly synced to the code state, so why not just grep the code files? This is just more unstructured indexing.
Yeah, my teammates seem to enjoy checking in endless walls of MD text: "documentation" generated by an LLM after it's done adding a feature. So even if that's an extreme and your documentation is more thoughtful, there is still a problem of:
* redundancy with the code: if code samples can be generated from the code, why bother duplicating them? What do they add? Can they not be LLM-generated later, and possibly kept somewhere out of the way (like a website) so as not to clutter the codebase with redundancy?
* if you do go for this duplication, then you are on the hook for ensuring it's always up to date, otherwise it becomes worse than duplicated: misleading
So my preference is, when adding something to the repo, think very hard whether this information is redundant or not. Handcrafted docs, notes, comments that add more context like why was this built that way after a ton of deliberation - yes. Anything that is trivially derived from the code itself - no.
I've been trying to push people to use hitchstory or similar to generate docs from specification tests precisely to avoid that redundancy but most people just look blankly at it and go "why don't you just do that with AI?"
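To make the idea concrete, here is a minimal sketch of docs generated from executable specs. This is plain Python and not hitchstory's actual API; `slugify`, `SPECS` and `run_and_render` are made-up names for illustration. The point is only that each example is both a test and a doc entry, so the docs cannot drift without a test failing.

```python
# Hypothetical spec format: each entry is both a test and a doc example.
# NOT hitchstory's API, just plain Python showing the general idea.


def slugify(title: str) -> str:
    """Toy function under test (stand-in for real project code)."""
    return "-".join(title.lower().split())


SPECS = [
    {"name": "Basic title", "given": "Hello World", "expect": "hello-world"},
    {"name": "Extra whitespace", "given": "  Many   spaces ", "expect": "many-spaces"},
]


def run_and_render(specs, func) -> str:
    """Run every spec, fail loudly on a mismatch, and return markdown docs."""
    lines = [f"# {func.__name__} examples", ""]
    for spec in specs:
        actual = func(spec["given"])
        # A doc line is only emitted if the example actually passes.
        assert actual == spec["expect"], f"{spec['name']}: got {actual!r}"
        lines.append(
            f"* {spec['name']}: `{func.__name__}({spec['given']!r})` returns `{actual!r}`"
        )
    return "\n".join(lines)


DOCS = run_and_render(SPECS, slugify)
print(DOCS)
```

If someone changes `slugify` and an example stops holding, the render step raises instead of publishing a stale page, which is the property the hand-duplicated MD files lack.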
Grepping works when you wrote the code. Not so much when someone else installs your package and has no idea which export is public API. We added a one-page markdown saying "use these, ignore the rest" and the wrong-import issues mostly stopped.
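A rough sketch of how a page like that could be derived rather than hand-maintained, assuming a Python package that declares its public names in `__all__`. The package `mypkg` and its functions are hypothetical and built in memory only so the example runs on its own; `render_public_api` is an illustrative helper, not part of any library.

```python
import types


def render_public_api(module) -> str:
    """Render a short markdown page listing only the names in __all__."""
    public = getattr(module, "__all__", [])
    lines = [f"# Public API of {module.__name__}", "", "Use these, ignore the rest:", ""]
    for name in public:
        obj = getattr(module, name)
        doc = (getattr(obj, "__doc__", None) or "").strip().splitlines()
        # First docstring line is used as the one-line summary.
        summary = doc[0] if doc else "(no docstring)"
        lines.append(f"* `{name}`: {summary}")
    return "\n".join(lines)


# Hypothetical package, assembled in memory to keep the sketch self-contained.
mypkg = types.ModuleType("mypkg")


def connect():
    """Open a connection."""


def _retry_internal():
    """Implementation detail."""


mypkg.connect = connect
mypkg._retry_internal = _retry_internal
mypkg.__all__ = ["connect"]  # the only name users are meant to import

PAGE = render_public_api(mypkg)
print(PAGE)
```

The "why" behind each export still has to be written by a person; this only covers the "which names" part that grep cannot answer for an outsider.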