I think the website should probably mention those installation presets in unsloth's pyproject.toml though. The website instructs you to install dependencies separately, but it turns out there are dedicated presets in the project that install specific ROCm/CUDA/XPU versions.
uv helps you out here though. Use a pyproject.toml and uv sync. Everything goes into the venv only, nothing spread across the whole system.
The pyproject.toml can even handle the build environment for you, so you no longer need a setup.sh that installs 10 tools in a specific order with specific flags to produce a working environment. A single uv sync, and the job is done.
Plus the result is reproducible, so if uv sync works this time, it will also work next time.
Highly recommended if you are still on pip.
Note: as an example, I used this to install unsloth with a ROCm setup based on unreleased git dependencies and graphics-card-specific build flags, and all of it can be handled with one command, 'uv sync'. Done any other way, this would require a big pile of shell scripts. https://github.com/unslothai/unsloth/issues/4280#issuecommen...
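A minimal sketch of what such a pyproject.toml can look like. The extra names, package pins, and index URL below are illustrative, not unsloth's actual presets; check the uv docs for the exact [tool.uv] syntax:

```toml
[project]
name = "finetune-env"            # hypothetical project
version = "0.1.0"
requires-python = ">=3.10"
dependencies = []

[project.optional-dependencies]
# hardware presets selected with `uv sync --extra rocm` etc.
rocm = ["torch==2.4.0"]
cuda = ["torch==2.4.0"]

# pin a git dependency to a specific revision for reproducibility
[tool.uv.sources]
some-package = { git = "https://github.com/org/repo", rev = "abc123" }

# hardware-specific wheel index, only used when asked for explicitly
[[tool.uv.index]]
name = "pytorch-rocm"
url = "https://download.pytorch.org/whl/rocm6.2"
explicit = true
```

Then a single `uv sync --extra rocm` resolves everything into the project's .venv.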
I think part of the problem is how these MCP services are designed. A lot of them just return megabytes of text blob without any filtering, and thus explode the context.
And it's also affected by how the model is trained. Gemini specifically likes to read large amounts of text directly and explodes the context, while Claude tries to use tools for partial search or writes a script to sample from a very large file. Gemini always fills the context way faster than Claude when doing the same job.
But I guess with a badly designed MCP there is not much the model can do, because the results are injected into the context directly (unless the runtime decides to redirect them somewhere else).
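The server-side fix is to filter and paginate before anything reaches the model. A rough sketch of what that can look like; the tool name, data source, and budget are all made up, and this is plain Python rather than any real MCP SDK:

```python
import json

MAX_CHARS = 4000  # rough per-result context budget (arbitrary)

def search_logs(query: str, page: int = 0, page_size: int = 20) -> str:
    """Hypothetical MCP tool handler: filter and paginate server-side
    instead of dumping the whole blob into the model's context."""
    # pretend data source; a real tool would read a file or database
    lines = [f"line {i}: error in module {i % 7}" for i in range(10_000)]
    hits = [line for line in lines if query in line]
    start = page * page_size
    result = {
        "total_matches": len(hits),
        "page": page,
        "results": hits[start:start + page_size],  # only a small slice
    }
    text = json.dumps(result)
    # hard cap as a last resort so one call can't explode the context
    return text[:MAX_CHARS]

reply = json.loads(search_logs("module 3"))
print(reply["total_matches"], len(reply["results"]))
```

The model can still see how many matches exist and ask for the next page, but each call costs a bounded number of tokens.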
Tell me how many different ways of printing a help message for a command you have seen, and then say "reusable" again. MCP exists exactly to solve this. The rest is just JSON-RPC with simple key-value pairs.
You can probably let the LLM guess the help flag and try to parse the help message, but the success rate totally depends on the model you are using.
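That uniformity is the whole point: instead of parsing a different help format per CLI, every tool call has the same shape. Roughly, this is what an MCP tool call looks like on the wire (field names per my reading of the spec; treat the details as an approximation):

```python
import json

# A JSON-RPC 2.0 request in the shape MCP uses for tool calls.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_logs",            # which tool to invoke
        "arguments": {"query": "error"},  # plain key-value pairs
    },
}
print(json.dumps(request, indent=2))
```

The model never guesses flags; the server advertises each tool's input schema up front, and the arguments are just this key-value dict.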
If it explores all these cases over a few months and makes the tool itself obsolete, that sounds like a total win to me?
That won't happen unless Firefox just stops developing, though. New code comes with new bugs, and there must be some people or some tool to find them.
It's not really a question of good or bad, though. It's a more directed fuzzer than the rest: while it can craft a payload that triggers a flaw deep in a flow path, it could also miss some obvious pattern that normal people wouldn't expect to be a problem (which is what most fuzzers currently test).
Wondering how it decides to show the force-quit dialog. I used to use an 8GB MacBook for development, but instead of warning on serious memory exhaustion, it just decided to lag and kill itself with everything frozen (including the restart button).
From my personal experience, qwen 30b a3b understands commands quite well as long as the input is not big enough to ruin the attention (I feel the boundary is somewhere between 8000 and 12000 tokens?). But that isn't really a bug in the model itself. A smaller model just has a shorter memory; it's simply a physical restriction.
I ran a mixed extraction, cleaning, translation, and formatting task on jobs averaging 6000 tokens of input, and so far only 30b a3b is smart enough not to miss job details (most of the time).
I later refactored the task into multiple passes using smaller models, though. Making each job simpler is still a better strategy for getting clean output if you can change the pipeline.
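The multi-pass shape is simple enough to sketch. Everything here is hypothetical: call_llm stands in for whatever client you actually use (stubbed to an echo so the pipeline shape runs on its own), and the pass prompts are placeholders:

```python
def call_llm(system: str, text: str) -> str:
    """Placeholder for a real model call; echoes input for demonstration."""
    return text

def extract(text: str) -> str:
    # pass 1: pull out only the job detail fields
    return call_llm("Extract the job detail fields as JSON.", text)

def translate(text: str) -> str:
    # pass 2: translate values, leave structure alone
    return call_llm("Translate values to English; keep keys unchanged.", text)

def format_output(text: str) -> str:
    # pass 3: render the final report
    return call_llm("Render the JSON as the final report format.", text)

def pipeline(raw: str) -> str:
    # each pass sees one short, focused instruction instead of a single
    # 6000-token prompt doing extraction, cleaning, translation, and
    # formatting all at once
    return format_output(translate(extract(raw)))

print(pipeline("example job posting"))
```

Each stage is small enough that a weaker model can handle it, and a bad stage can be swapped or retried without rerunning the whole job.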