| # | Title | Date |
|---|---|---|
| 4 | How are people training these LLMs? Don't they need a lot of money? | 2024-01-19 |
| 3 | Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs | 2024-01-10 |
| 3 | FireAttention – Serving Mixtral and open-source MoE models at 4x speed vs. vLLM | 2024-01-09 |
| 20 | FireAttention V3: Enabling AMD as a Viable Alternative for GPU Inference | 2024-10-17 |
| 17 | Fireworks F1: A Breakthrough in Complex Reasoning with Compound AI | 2024-11-18 |
| 7 | FireFunction V1 – GPT-4-level function calling model – 4x faster, open weights | 2024-02-22 |
| 2 | How Fireworks evaluates quantization precisely and interpretably | 2024-08-03 |
| 1 | Can DeepSeek R1 Teach Better Than Humans? | 2025-02-05 |
| 1 | Document Inlining: Crossing the Modality Gap with Compound AI | 2024-12-23 |
| 1 | GPUs on-demand: Not serverless, not reserved, but some third thing | 2024-06-07 |
| 4 | LLM Eval Driven Development with Claude Code | 2025-08-28 |
| 1 | Natural Language → SQL with Reinforcement Fine Tuning (RFT) | 2025-08-18 |