Hacker News | mmis1000's comments

I think the website should probably mention those installation presets in unsloth's pyproject.toml, though. The website instructs you to install dependencies separately, but it turns out there are dedicated presets that install specific ROCm/CUDA/XPU versions in the project.

Uv helps you out here, though. Use a pyproject.toml and `uv sync`; everything goes into the venv only, nothing is spread across the whole system.

The pyproject.toml can even handle the build environment for you, so you no longer need a setup.sh that installs ten tools in a specific order with specific flags to produce a working environment. A single `uv sync`, and the job is done.

Plus the result is reproducible, so if `uv sync` works this time, it will also work next time.

Highly recommended if you are still on pip.

Note: as an example, I used to install unsloth with a ROCm setup based on unreleased git dependencies and graphics-card-specific build flags, and all of it can be handled with one `uv sync`. Doing it any other way would require a big pile of shell scripts. https://github.com/unslothai/unsloth/issues/4280#issuecommen...
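To give an idea of the shape of such a setup, here is a minimal, hypothetical pyproject.toml sketch using uv's index pinning (the project name, index URL, and pins here are made up for illustration, not unsloth's actual configuration):

```toml
[project]
name = "my-finetune-env"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "torch",
    "unsloth",
]

# Hypothetical: pull torch from a ROCm-specific wheel index instead of PyPI.
[[tool.uv.index]]
name = "pytorch-rocm"
url = "https://download.pytorch.org/whl/rocm6.2"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-rocm" }
```

With something like this in place, `uv sync` resolves and installs everything into the project venv in one step.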


I think part of the problem is how these MCP services are designed. A lot of them just return megabytes of text blob without any filtering, and thus explode the context.

And it's also affected by how the model is trained. Gemini specifically likes to read large amounts of text data directly and explodes the context, while Claude tries to use a tool for partial search or writes a script to sample from a very large file. Gemini always fills the context way faster than Claude when doing the same job.

But I guess in the case of a badly designed MCP, there is not much the model can do, because the results are injected into the context directly (unless the runtime decides to redirect them somewhere else).
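A sketch of what such runtime-side redirection could look like (this is a hypothetical guard, not part of any actual MCP runtime): spill oversized tool output to a file and inject only a short preview, so the model can sample the rest with follow-up tool calls.

```python
# Hypothetical runtime-side guard: instead of injecting a huge MCP tool
# result straight into the model context, spill it to a temp file and
# inject only a short preview plus the file path.
import tempfile

MAX_INLINE_CHARS = 4_000  # arbitrary budget for in-context tool output


def guard_tool_result(result: str) -> str:
    if len(result) <= MAX_INLINE_CHARS:
        return result
    # Redirect the full blob to disk so the model can sample it with
    # follow-up tool calls (grep, head, ...) instead of reading it all.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(result)
        path = f.name
    preview = result[:MAX_INLINE_CHARS]
    return (
        f"[tool output was {len(result)} chars; truncated. "
        f"Full output saved to {path}]\n{preview}"
    )
```

Whether this helps depends on the model being trained to actually follow up on the saved file rather than just working from the preview.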


Tell me how many different ways of printing a help message for a command you have seen, then say "reusable" again. MCP exists exactly to solve this; the rest is just JSON-RPC with simple key-value pairs.

You can probably let the LLM guess the help flag and try to parse the help message, but the success rate depends entirely on the model you are using.


Most CLIs use `--help`; anything else is just plain hostile to the users.

`-h` is also popular, but that shorthand can clash with other meanings, hence `--help`.


Some come with only a very short description, and most of the details are only discoverable via `man`.

And Windows mostly uses `/?` and also `/h`.

Java uses a single `-` for long arguments because it doesn't have short ones.

I doubt it is anywhere close to reusable.

And even the allowed position of parameters (or even the meaning of arguments, in the case of ffmpeg) is program dependent.

Some allow them anywhere as long as they start with a dash; some only allow them before the first input.


If it explores all these cases after a few months and makes the tool itself obsolete, that sounds like a total win to me?

However, that won't happen unless Firefox just stops developing. New code comes with new bugs, and there must be some people or some tool to find them.


It's not really a question of good or bad, though. It's a fuzzer that is more directed than the rest: while it can craft a payload that triggers a flaw in a deep code path, it could also miss some obvious pattern that normal people wouldn't expect to be a problem (which is what most fuzzers currently test).


> Only complaint is it sometimes decides to ignore half your prompt when instructions get long

This sounds like your context is too big and getting cut off.


Wondering how it decides to show the force-quit dialog. I used to use an 8 GB MacBook for development, but instead of warning on serious memory exhaustion, it would just lag and kill itself with everything frozen (including the restart button).


I think AMD only just added ROCm support for RDNA2 recently? I can run torch and aisudio with it just fine.

They also finally fixed building all the AI-related stuff on Windows, so you are no longer limited to Linux for these.


From my personal experience, qwen 30b a3b understands commands quite well as long as the input isn't big enough to ruin the attention (I feel the boundary is somewhere between 8000 and 12000 tokens?). But that isn't really a bug in the model itself; a smaller model just has a shorter memory, it's simply a physical restriction.

I ran a mixed extraction, cleaning, translation, and formatting task on jobs that average 6000 tokens of input, and so far only 30b a3b is smart enough not to miss job details (most of the time).

I later refactored the task into multiple passes using a smaller model, though. Making the job simpler is still a better strategy for getting clean output, if you can change the pipeline.
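The multi-pass refactor can be sketched roughly like this (the pass instructions and the `call_model` callback are hypothetical stand-ins for whatever local LLM API is in use, not the actual pipeline):

```python
# Split one big mixed prompt into several focused passes so each step
# stays well under the smaller model's attention budget.
from typing import Callable


def multi_pass(text: str, call_model: Callable[[str, str], str]) -> str:
    # One focused instruction per pass instead of a single giant prompt.
    passes = [
        "Extract the relevant fields from this text.",
        "Clean up the extracted fields (fix obvious noise).",
        "Translate the cleaned fields to English.",
        "Format the result as one line per field.",
    ]
    result = text
    for instruction in passes:
        result = call_model(instruction, result)
    return result
```

The trade-off is more calls and some latency, in exchange for each pass being simple enough that a smaller model rarely drops details.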

