| Points | Title | Date |
|---|---|---|
| 4 | How are people training this LLMs? Dont they need lot of money? | 2024-01-19 |
| 53 | Fireworks: Function Calling Model and API | 2023-12-21 |
| 3 | Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs | 2024-01-10 |
| 3 | FireAttention – Serving Mixtral and open-source MoE models at 4x speed vs. vLLM | 2024-01-09 |
| 3 | Multi-Query Attention Is All You Need | 2023-07-13 |
| 2 | Accelerating Code Completion with Fireworks Fast LLM Inference | 2023-10-11 |
| 1 | Fireworks.ai: Language Model Serving with Custom LoRA Fine-Tuned Models | 2023-08-18 |
| 20 | FireAttention V3: Enabling AMD as a Viable Alternative for GPU Inference | 2024-10-17 |
| 17 | Fireworks F1: A Breakthrough in Complex Reasoning with Compound AI | 2024-11-18 |
| 7 | FireFunction V1 – GPT-4-level function calling model – 4x faster, open weights | 2024-02-22 |
| 2 | How Fireworks evaluates quantization precisely and interpretably | 2024-08-03 |
| 1 | Can DeepSeek R1 Teach Better Than Humans? | 2025-02-05 |
| 1 | Document Inlining: Crossing the Modality Gap with Compound AI | 2024-12-23 |
| 1 | GPUs on-demand: Not serverless, not reserved, but some third thing | 2024-06-07 |
| 4 | LLM Eval Driven Development with Claude Code | 2025-08-28 |
| 1 | Natural Language → SQL with Reinforcement Fine Tuning (RFT) | 2025-08-18 |