
With all the models I tried there was quite a bit of fiddling for each one to get the correct command-line flags and a good prompt, or at least copy-pasting a command line from HF. It seems like every model needs its own unique prompt to give good results? I guess that is what the wrappers take care of. Other than that, llama.cpp is very easy to use. I even run it on my phone in Termux, but only with a tiny model that is more entertaining than useful for anything.
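For what it's worth, a typical invocation looks something like this (the binary name and exact flags vary between llama.cpp versions, and the [INST] wrapping is just the Llama-2-chat convention, so treat it as a sketch):

    ./main -m ./models/llama-2-7b-chat.Q4_K_M.gguf \
      -n 256 --temp 0.7 \
      -p "[INST] Explain what llama.cpp does. [/INST]"

Swap in a different model file and the -p string usually has to change with it, which is exactly the fiddling above.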


The chat models are all finetuned with slightly different prompt formats - see Llama's, for example. So having a conversion layer between the OAI API that everyone's used to now and the slightly inscrutable formats of models like Llama is very helpful - though, much like langchain with its hardcoded prompts everywhere, there's some subjectivity involved, and you may be rewarded for formatting prompts directly.
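As a minimal sketch of that kind of conversion (assuming the Llama-2-chat template as Meta published it; the function name is made up, and real implementations like HF chat templates also handle BOS/EOS at the token level):

    # Sketch: OpenAI-style messages -> Llama-2-chat prompt string.
    # Illustrative only, not from any library.
    def to_llama2_prompt(messages):
        system = ""
        prompt = ""
        for msg in messages:
            role, content = msg["role"], msg["content"]
            if role == "system":
                # Llama 2 folds the system prompt into the first user turn
                system = "<<SYS>>\n" + content + "\n<</SYS>>\n\n"
            elif role == "user":
                prompt += "<s>[INST] " + system + content + " [/INST]"
                system = ""
            elif role == "assistant":
                prompt += " " + content + " </s>"
        return prompt

    to_llama2_prompt([
        {"role": "system", "content": "Be terse."},
        {"role": "user", "content": "What is a GGUF file?"},
    ])
    # -> '<s>[INST] <<SYS>>\nBe terse.\n<</SYS>>\n\nWhat is a GGUF file? [/INST]'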


The slight incompatibilities in prompt formats and style are a nuisance. I have just been looking at Mistral's prompt design documentation, and I now feel like I have underutilized Mistral-7B and Mixtral-8x7B: https://docs.mistral.ai/guides/prompting-capabilities/
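For comparison, the Mistral/Mixtral instruct format is close to Llama's but has no <<SYS>> block; system-style instructions just ride along in the first [INST] turn. A rough sketch (again, the helper name is illustrative):

    # Sketch of the Mistral-7B-Instruct / Mixtral-8x7B-Instruct format.
    def to_mistral_prompt(turns):
        # turns: list of (user, assistant) pairs; assistant is None for the final turn
        prompt = "<s>"
        for user, assistant in turns:
            prompt += "[INST] " + user + " [/INST]"
            if assistant is not None:
                prompt += " " + assistant + "</s>"
        return prompt

    to_mistral_prompt([("You are a pirate. Say hi.", "Arr!"), ("Now say bye.", None)])
    # -> '<s>[INST] You are a pirate. Say hi. [/INST] Arr!</s>[INST] Now say bye. [/INST]'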



