Hacker News
Modular Acquires BentoML (bentoml.com)
2 points by djhu9 2 days ago | past | discuss
The Best Open-Source Small Language Models (bentoml.com)
2 points by zyh888 57 days ago | past
Three Levels of Running LLMs from Laptop to Cluster-Scale Distributed Inference (bentoml.com)
3 points by bbzjk7 72 days ago | past
Where to Buy or Rent GPUs for LLM Inference: The 2026 GPU Procurement Guide (bentoml.com)
3 points by sherlockxu 3 months ago | past
ChatGPT Usage Limits: What They Are and How to Get Rid of Them (bentoml.com)
1 point by bbzjk7 3 months ago | past
LLM Benchmark and Optimization Explorer (bentoml.com)
1 point by tanelpoder 5 months ago | past
AMD Data Center GPUs Explained: MI250X, MI300X, MI350X and Beyond (bentoml.com)
1 point by djhu9 5 months ago | past
Nvidia Data Center GPUs Explained: From A100 to B200 and Beyond (bentoml.com)
4 points by bbzjk7 5 months ago | past
Benchmarks Show Speculative Decoding Needs the Right Draft Model for 3× Gains (bentoml.com)
1 point by bbzjk7 6 months ago | past
LLM Inference Handbook (bentoml.com)
366 points by djhu9 7 months ago | past | 26 comments
What Is InferenceOps (bentoml.com)
2 points by sherlockxu 7 months ago | past
The Shift to Distributed LLM Inference (bentoml.com)
4 points by djhu9 8 months ago | past
How to Beat the GPU CAP theorem in AI Inference (bentoml.com)
3 points by sherlockxu 9 months ago | past
Cold-Starting LLMs on Kubernetes in Under 30 Seconds (bentoml.com)
2 points by djhu9 10 months ago | past
Six Infrastructure Pitfalls Slowing Down Your AI Progress (bentoml.com)
2 points by djhu9 11 months ago | past
The Complete Guide to DeepSeek Models: From V3 to R1 and Beyond (bentoml.com)
2 points by bbzjk7 11 months ago | past
Survey shows 65% of organizations are still establishing their AI foundations (bentoml.com)
1 point by sherlockxu 11 months ago | past | 1 comment
2024 State of AI Inference Infrastructure Survey Results (bentoml.com)
2 points by bbzjk7 11 months ago | past
Secure and Private DeepSeek Deployment (bentoml.com)
1 point by djhu9 12 months ago | past
A Guide to ComfyUI Custom Nodes (bentoml.com)
1 point by bbzjk7 on Jan 2, 2025 | past
A List of Top Open-Source Embedding Models (bentoml.com)
5 points by bbzjk7 on Oct 30, 2024 | past | 1 comment
Top Open-Source Vision Language Models (bentoml.com)
1 point by sherlockxu on Oct 11, 2024 | past
Exploring the World of Open-Source Text-to-Speech Models (bentoml.com)
2 points by sherlockxu on Sept 20, 2024 | past
Tuning TensorRT-LLM for Optimal Serving (bentoml.com)
1 point by djhu9 on Sept 20, 2024 | past
Compound AI Systems (bentoml.com)
1 point by AnhTho_FR on Aug 24, 2024 | past
Why should you care about compound AI? (bentoml.com)
1 point by bbzjk7 on Aug 16, 2024 | past
From Ollama to OpenLLM: Running LLMs in the Cloud (bentoml.com)
3 points by sherlockxu on July 18, 2024 | past
Benchmarking LLM Inference Back Ends: VLLM, LMDeploy, MLC-LLM, TensorRT-LLM, TGI (bentoml.com)
15 points by chaoyu on July 5, 2024 | past | 1 comment
Is LMDeploy the Ultimate Solution? Why It Outshines VLLM, TRT-LLM, TGI, and MLC (bentoml.com)
16 points by helloericsf on June 20, 2024 | past | 8 comments
Stable Diffusion 3: Text Master, Prone Problems? (bentoml.com)
3 points by sherlockxu on June 18, 2024 | past