
lifthrasiir · 2026-06-25T04:50:48 1782363048

Oh wow, seriously, I always thought Lua should have been like this. The 5.1/5.2/5.3+ split was so painful.

> My ultimate goal was to support LuaJIT in Rust as well but this does not make it easier.

I think you could stop right before the syntax extension.

lifthrasiir · 2026-06-22T19:10:56 1782155456

The total model size is about 1.2GB (UNet + SDXL VAE included), so probably ~3GB?

lifthrasiir · 2026-06-22T19:05:33 1782155133

Tried it a bit, and while it is very impressive for a 0.2B model, it would be very hard to convince me that this matches 10B models. It did work reasonably well with natural images, but inpainted regions were visibly smoother than their surroundings, and it performed very badly on novel objects. It is also limited to 512x512 output, which limits its practical usefulness.

amelius · 2026-06-22T21:36:29 1782164189

Do you think the provided examples are representative of its performance, or do you think they were cherry picked?

lifthrasiir · 2026-06-22T21:41:13 1782164473

Given its limited output dimension it's hard to tell. I haven't exactly tested fine-tuned variants but I think they would work well under certain situations. After all, some (possibly cherry-picked) examples still exhibit similar problems when you inspect them in detail.

lifthrasiir · 2026-06-22T10:54:09 1782125649

I had yet another reason to use inline assembly in WAH [1]. In some compilers (I think it was GCC) intrinsics carry `target("foo")` attributes, which somehow cause forced inlining via the `always_inline` attribute to fail. I really needed that though, because I was writing a fast bytecode compiler and being unable to force-inline meant each supported SIMD instruction had to pay function call overhead (which can be significant when your bytecode is literally a single native SIMD operation!).

[1] https://github.com/lifthrasiir/wah/

mananaysiempre · 2026-06-22T11:57:06 1782129426

I’m having vague flashbacks here so I might be guessing wrong, but: the only reason I’ve ever seen an “inlining failed” error from GCC when using intrinsics (and it was an actual error) was when it actually meant “you can’t use this intrinsic in this configuration” (the second half of the message, “target specific option mismatch”, is more helpful if still cryptic). Thus the fix was to change the argument of the -march= option or (for dynamic dispatch) decorate the caller with the correct __attribute__((target)). E.g. if you pass -march=x86-64-v1 but try to use AVX you’ll get such an inlining error. (This is unlike MSVC which will always allow you to use any intrinsic supported by the compiler.)

lifthrasiir · 2026-06-22T12:46:28 1782132388

I think you are right in the interpretation. In my case though dynamic dispatch was already integrated with bytecode lowering so the error wasn't helpful :(

mananaysiempre · 2026-06-22T13:14:38 1782134078

Ah, so you want the function containing the interpreter loop to be compiled for the baseline architecture but some of the bytecode implementations inside it to use more advanced intrinsics? Yeah, I don’t think GCC has a good answer to that one. It also sounds gnarly from a general calling-convention perspective—how is the VZEROUPPER on exit supposed to be emitted if you can’t count on AVX?..

We recently (finally) got __attribute__((musttail)) in GCC[1], I’ve just tried it between functions with mismatched __attribute__((target))s and it does work, so theoretically you could code your interpreter that way. But it seems like you’re still bound to keep loading and storing vector state from and to memory and VZEROUPPERing your registers after each bytecode, and that doesn’t sound like a particularly good time.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119616

lifthrasiir · 2026-06-22T08:49:16 1782118156

> I have never written a bug that has destroyed my users' hardware, ...

Probably whoever (human or agent) originally decided to put TRACE logs into SQLite also thought---or reasoned---so. Maybe the decision was right at that time, but the amount of TRACE logs has increased enormously since. You will never know.

applfanboysbgon · 2026-06-22T09:03:34 1782119014

I love that we've moved the goalposts from "LLMs are better than artisanal software engineers" to "actually, shipping hardware-destroying bugs in production is literally unavoidable, nobody could possibly avoid doing it".

lifthrasiir · 2026-06-22T09:11:12 1782119472

I only meant what I said. After all, the OP's thesis was that LLMs aren't better than artisanal software engineers, wasn't it? There was no goalpost to move, at least in this particular thread. And the solution might be another agent monitoring those oft-ignored signals.

lifthrasiir · 2026-06-21T09:10:42 1782033042

Intent matters. Compilers can't see through your skull to infer your intent and thus behave very conservatively unless you override that behavior somehow. This inference, alas, also takes (much) time, so compilers have to balance compilation time against the quality of the inferred intent as well. (This is why we can't exactly use LLMs in mainstream compilers, by the way.) So go and make a programming language that preserves your intent by every means; but making it practical would be very difficult.

lifthrasiir · 2026-06-18T13:03:28 1781787808

The vast majority of that compute is locked in AI accelerators that do the inference. That hardware is bad at doing anything other than that---in fact, crawlers would need more residential proxies rather than more compute in that regard.

lifthrasiir · 2026-06-12T19:51:59 1781293919

I seriously attempted to write my own WebAssembly 3.0 implementation recently, and while I did finish the whole thing [1], it left me with a bitter taste about WasmGC, which turned out to be very annoying to implement. In fact, I originally wanted to avoid GC, but spectest assumed that GC is always available and I had no other option but to implement one in order to make use of spectest in the first place.

[1] https://github.com/lifthrasiir/wah/

panick21_ · 2026-06-12T22:15:40 1781302540

Interfacing with GC is usually hard; how should it have been done?

lifthrasiir · 2026-06-13T04:04:38 1781323478

Of course, but I'm talking about "annoyance". The GC type system is especially annoying if you are not writing a full compiler.

lifthrasiir · 2026-06-07T10:28:22 1780828102

While this has been downvoted to death, it is fun to guess how many entries are submitted to each IOCCC. My best guess is around 10^2.5, i.e. 300--400. Rationales:

- The number of winning entries and losing entries that get revealed later in public suggests that this number should be at least 50.

- The number of judging rounds, as the FAQ says, is at least 3 and possibly more. If each judging round eliminates about half of the entries, we should expect at least 10 submissions per winning entry. I personally think the actual elimination rate can be as low as 10--20% at the end, but at least the first few rounds should be easy, so I think this is a good minimum guess: 100--200.

- The current number of individual judges is just enough for a three-digit number of submissions. It bears a striking resemblance to typical academic conferences with typical acceptance rates, by the way! If there were thousands of submissions (like today's AI conferences...) there ought to be many more judges, and more importantly, more levels of judges so that each judge can do just enough work throughout the entire process. So this establishes the maximum guess: 1,000.

- My best guess is simply the geometric mean of the two extrema.
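
The arithmetic behind this estimate can be sketched as below. This is only a rough illustration: it assumes the low end of the minimum guess (100) and the stated maximum (1,000), and the function names `geometric_mean` and `submissions_per_winner` are made up for this sketch.

```python
import math

def geometric_mean(low: float, high: float) -> float:
    """Geometric mean of two positive bounds."""
    return math.sqrt(low * high)

def submissions_per_winner(rounds: int, keep_fraction: float) -> float:
    """Submissions needed per surviving entry if every round keeps
    `keep_fraction` of what it receives (an assumed, uniform rate)."""
    return (1 / keep_fraction) ** rounds

# Assumed bounds: low end of the minimum guess and the maximum guess.
min_guess = 100
max_guess = 1000

best = geometric_mean(min_guess, max_guess)
print(round(best))                        # 316
print(round(math.log10(best), 2))         # 2.5
print(submissions_per_winner(3, 0.5))     # 8.0, i.e. roughly 10 per winner
```

Three rounds that each halve the field give a factor of 8, which is where the "at least 10 submissions per winning entry" ballpark comes from, and the geometric mean of 100 and 1,000 lands on 10^2.5.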

lifthrasiir · 2026-06-07T10:07:06 1780826826

Just two months ago I tried to write a short K program with Claude Opus 4.6, only to find that while it had sufficient knowledge of the K vocabulary, it didn't try to make good use of it. K is, while slightly obscure and obfuscated, a real programming language and certainly better known than obfuscated programming. I don't have high hopes for IOCCC-grade obfuscation.