| HN Points | HN Title | Submitted Date |
|---|---|---|
| 113 | A guide to open-source LLM inference and performance | 2023-11-20 |
| 51 | How we got Stable Diffusion XL inference to under 2 seconds | 2023-08-31 |
| 402 | Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA | 2023-03-22 |
| 247 | Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs | 2025-08-07 |