Hacker News

You can run it today on that 3060 with 12 GB of VRAM, but I would suggest getting two 3090s. Use the cmoe option: it keeps the attention/router tensors on the GPU and offloads the rest to system memory. Try it now and see what performance you get.
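To make the split concrete, here is a rough Python sketch of the placement rule described above: expert weights go to system memory, everything else (attention, router) stays on the GPU. The tensor names, the name pattern, and the sizes are all assumptions made up for illustration, not read from any real model or tool.

```python
import re

# Assumption: MoE expert weights can be recognized by a name pattern like this.
# The tensor names used below are hypothetical.
EXPERT_PATTERN = re.compile(r"\.ffn_(up|down|gate)_exps")


def place_tensor(name: str) -> str:
    """Return "cpu" for MoE expert weights, "gpu" for everything else."""
    return "cpu" if EXPERT_PATTERN.search(name) else "gpu"


def split_by_device(sizes_gb: dict[str, float]) -> dict[str, float]:
    """Sum tensor sizes (in GB) per device under the placement rule above."""
    totals = {"gpu": 0.0, "cpu": 0.0}
    for name, size in sizes_gb.items():
        totals[place_tensor(name)] += size
    return totals


# Hypothetical sizes for one layer, for illustration only.
example = {
    "blk.0.attn_q.weight": 0.05,        # attention: stays on the GPU
    "blk.0.attn_k.weight": 0.01,        # attention: stays on the GPU
    "blk.0.ffn_gate_inp.weight": 0.001, # router: stays on the GPU
    "blk.0.ffn_up_exps.weight": 1.2,    # experts: offloaded to system memory
    "blk.0.ffn_down_exps.weight": 1.2,
    "blk.0.ffn_gate_exps.weight": 1.2,
}

print(split_by_device(example))
```

The point of the sketch is just that the expert weights are the bulk of the model, so keeping only the small attention/router tensors in VRAM is what lets a 12 GB card run it at all.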