What am I missing here? I thought this model needs 46GB of unified memory for 4-bit quant. The Radeon RX 7900 XTX has 24GB of memory, right? Hoping to get some insight, thanks in advance!
MoEs can be efficiently split between dense weights (attention/KV/etc.) and sparse expert (MoE) weights. By keeping the dense weights on the GPU and offloading the sparse weights to slower CPU RAM, you can still get surprisingly decent performance out of a lot of MoEs, since only a few experts are activated per token.
Not as good as running the entire thing on the GPU, of course.
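To make the arithmetic concrete, here's a rough Python sketch of the split. The parameter count, dense fraction, and KV-cache budget are illustrative assumptions, not figures for any specific model; the point is that only the dense slice has to fit in VRAM, while the much larger expert slice can sit in system RAM:

```python
# Back-of-the-envelope memory split for a MoE model: dense weights
# (attention, embeddings) plus the KV cache stay in VRAM, while the
# sparse expert weights are offloaded to system RAM.
# All numbers here are illustrative assumptions, not measured values.

BYTES_PER_PARAM = 0.5  # ~4-bit quantization

def split_estimate(total_params_b: float, dense_fraction: float, kv_cache_gb: float):
    """Return (vram_gb, ram_gb) for the dense/sparse split."""
    dense_gb = total_params_b * dense_fraction * BYTES_PER_PARAM
    expert_gb = total_params_b * (1 - dense_fraction) * BYTES_PER_PARAM
    return dense_gb + kv_cache_gb, expert_gb  # (GPU, CPU)

# Hypothetical ~100B-parameter MoE where ~15% of the weights are dense:
vram, ram = split_estimate(total_params_b=100, dense_fraction=0.15, kv_cache_gb=4)
print(f"GPU: ~{vram:.0f} GB VRAM, CPU: ~{ram:.0f} GB RAM")
# -> GPU: ~12 GB VRAM, CPU: ~42 GB RAM (fits a 24 GB card plus a 64 GB RAM box)
```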
Thanks to you I decided to give it a go as well (I didn't think I'd be able to run it on a 7900 XTX), and I must say it's awesome for a local model. More than capable for more straightforward stuff. It uses the full VRAM and about 60GB of RAM, but runs at about 10 tok/s and is *very* usable.
Thanks! "OLT" was also new to me. In case others find it helpful:
> OLT = Optical Line Terminal.
> In ISP fiber (typically GPON/EPON) infrastructure, it’s the provider-side device at the central office/headend that terminates and controls the passive optical network. It connects upstream into the ISP’s aggregation/core network and downstream via fiber (through splitters) to many customers’ ONTs/ONUs, handling PON line control, provisioning, QoS, and traffic aggregation.
POSIWID: https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_what_it_does