"Wrong" does not necessarily mean "against the standard". It means "against common usage and good team practice" in this context.
It's "allowed" to use raw pointers, malloc, and any number of things in C++ code. In general, if you do any of them in a modern codebase you're doing it wrong.
Yes, it's obviously "against common usage", given that HTML support exists specifically for the less common features that Markdown does not support. Like tables, which are supported by some implementations but not all; and iirc not even all Markdown variants that support tables use the same syntax for them. The only way to be 100% sure is to use HTML. Of course you wouldn't do that if you just have the file on GitHub, but in general HTML is supported in Markdown for a reason.
In an agentic loop, the model can keep calling multiple tools for each specialized artifact (like how the Claude web app renders HTML/SVG artifacts within a single turn). Models are already trained for this (I tested this approach with qwen 3.5 27B and it was able to follow Claude's lead from the previous turns).
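A schematic of such a loop, with a stubbed model standing in for a real API (every name here is hypothetical, just to show the shape of the pattern):

```python
def run_agent_loop(model, tools, prompt, max_steps=10):
    """Keep calling the model within one user turn; execute any requested
    tool and feed the result back until the model emits a final answer."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = model(messages)           # hypothetical model call
        if "tool" not in reply:
            return reply["content"]       # final answer ends the turn
        result = tools[reply["tool"]](reply["args"])
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    return None

# Stub model: first asks for a (made-up) rendering tool, then answers.
def stub_model(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"content": "rendered!"}
    return {"tool": "render_svg", "args": "<svg/>"}

print(run_agent_loop(stub_model, {"render_svg": lambda s: "ok"}, "draw a circle"))
# -> rendered!
```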
There is nothing wrong with big-money backing; it is often necessary for long-term bets. But rug pulling is a serious threat, and VC-funded open source has become a pattern/playbook.
It would have been fine if the Astral team was acqui-hired and uv, ruff, etc were donated to the PSF or Linux Foundation for further sponsorship and support.
But given the pressure from raising VC funding, I would imagine Astral needed an actual exit, and OpenAI saw Astral's tools as an asset.
The author doesn't explain (or is ignorant about) why this happens. These are special tokens that the model is trained on, and are part of its vocab. For example, here are the <think> and </think> tokens defined in the [Qwen3 tokenizer config](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507/bl...).
The model runtime recognizes these as special tokens. It can be configured via a chat template to replace these tokens with something else. This is how one provider modifies the XML namespace, while llama.cpp and vLLM move the content between the <think> and </think> tags to a separate field in the response JSON called `reasoning_content`.
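As a rough illustration (not any runtime's actual implementation), the post-processing step that splits reasoning out of the response could look like:

```python
import re

def split_reasoning(text: str) -> dict:
    """Move content between <think> tags into a separate field, loosely
    mirroring the `reasoning_content` behavior of llama.cpp / vLLM."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return {"content": text, "reasoning_content": None}
    reasoning = match.group(1).strip()
    content = (text[:match.start()] + text[match.end():]).strip()
    return {"content": content, "reasoning_content": reasoning}

print(split_reasoning("<think>User wants a haiku.</think>Here is your haiku."))
```

A chat template doing the namespace trick would instead just rewrite the tag strings before the client ever sees them.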
I only did it once, some 15 years back (a happy memory), using LFS. It took about a week to get to a functional system with the basic necessities. A code-finetuned model can write a functional chat UI with all the common features and a decent UX in under a minute.
Gemma 3 4B (QAT quant):
Yes, Paul Newman was indeed known to have struggled with alcohol throughout his life. While he maintained a public image of a charming, clean-cut star, he privately battled alcoholism for many years. He sought treatment in the late 1980s and early 1990s and was reportedly very open about his struggles and the importance of seeking help.
^^ I have vague memories of Clippy, but I remember it as obnoxious, often consuming the precious screen real estate on the low res monitors of the day, without offering anything valuable. But tooltips on the web with CSS and libraries like floating-ui can be much more compact, agile and barely noticeable.
I have tried showing help text in an internal app using tooltips that appear when the user hovers over the target element (or via a small icon on touch devices), and the feedback was good: the tooltips were never in the way but were easily available for help. (Accessibility for keyboard users needed some thought, but for that app's limited audience it was not a problem.) And while you're at it, maybe make it more engaging than a simple text-only tooltip (which can be done without any intelligence), and let the host customize it and offer complex workflows.
No it won't (most likely). VTracer (which the authors compare with) is fast, runs in browser via wasm, consumes way less resources and can even convert natural images very decently.
But the model seems cool for the usecase of prompt to logo or icon (over my current workflow of getting a jpg from flux and passing it through VTracer). I hope someone over at llama.cpp notices this (at least for the text-to-svg usecase, if not multimodal).
Author of VTracer here. Finally I'm able to comment on hackernews before the thread gets locked.
Would be interested in learning about your workflow. Is it a logo generation app?
I feel like this is an example of "machine learning is eating software". Raster-to-vector conversion is a perfect problem because we can generate datasets of arbitrary size and easily validate them with vectorize-rasterize roundtrips.
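A toy version of that roundtrip validation, using a parametric circle as the "vector" and a numpy rasterizer as a stand-in for real SVG rendering (all names here are illustrative, not VTracer's API):

```python
import numpy as np

def rasterize_circle(cx, cy, r, size=64):
    """Render a filled circle to a binary raster (stand-in for SVG rendering)."""
    y, x = np.mgrid[0:size, 0:size]
    return ((x - cx) ** 2 + (y - cy) ** 2 <= r ** 2).astype(np.uint8)

def roundtrip_error(target, params):
    """Fraction of pixels that differ after re-rasterizing the vector params."""
    rendered = rasterize_circle(*params, size=target.shape[0])
    return np.mean(target != rendered)

# Synthetic training pair: random vector params -> raster. The roundtrip
# lets us score any predicted vectorization against the ground truth.
target = rasterize_circle(32, 32, 10)
print(roundtrip_error(target, (32, 32, 10)))       # exact params -> 0.0
print(roundtrip_error(target, (30, 32, 10)) > 0)   # perturbed params -> True
```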
I did have an idea of performing tracing iteratively: adjusting the output SVG bit by bit until it matches the original image within a certain margin of error, and optimizing the output size of the SVG by simplifying curves where it does not degrade quality. But VTracer in its current state is one-shot and probably uses 1/100th of the computational resources.
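The curve-simplification half of that idea can be sketched with Ramer-Douglas-Peucker (one standard choice, not necessarily what VTracer would use): drop points from a polyline as long as the deviation stays under a tolerance.

```python
import math

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: simplify a polyline, keeping every removed
    point within `epsilon` of the simplified curve."""
    if len(points) < 3:
        return points
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy) or 1.0
    # Find the interior point farthest from the start-end chord.
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x1) - dx * (py - y1)) / length
        if d > dmax:
            dmax, idx = d, i
    if dmax <= epsilon:
        return [points[0], points[-1]]   # everything is close enough: drop it
    left = rdp(points[: idx + 1], epsilon)
    right = rdp(points[idx:], epsilon)
    return left[:-1] + right

# Points on a nearly straight line collapse to the two endpoints.
print(rdp([(0, 0), (1, 0.05), (2, -0.04), (3, 0)], 0.1))  # [(0, 0), (3, 0)]
```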
VTracer seems to perform badly on all the examples. I suspect it can be drastically improved simply by upscaling the image (via traditional interpolation, or machine learning based) and picking different parameters. But I am glad that it was cited!
Thanks for noticing this, and yes, I have also seen what you're pointing out, but it's workable for many use cases. I use this workflow for making images for marketing or the web, so the images are more artistic than photorealistic to begin with. Think of the stuff you can find on undraw, but generated by image models from prompts, then run through VTracer. The reproductions are not perfect but often good enough (the process can be slow depending on how sharp you want the curves, and the files are often very large, as you mentioned). Then I make any changes in Inkscape and convert back to raster for publishing.
> logo generation app
For logo generation, I would actually prefer code gen. I thought of this problem when reading about diffusion language models recently (if there is lots of training data available in the form of text-vector-raster triplets).