| Fireworks DevDay 2025 Wrapped | 2025-05-29 | 963 |
| Why do all LLMs need structured output modes? | 2024-02-20 | 2,766 |
| Announcing custom models and on-demand H100s with 50%+ lower costs and latency … | 2024-06-03 | 1,072 |
| Fireworks Real-World Benchmarks: Find the Best OSS Model for the Job | 2025-07-30 | 681 |
| Introducing OpenAI gpt-oss (20b & 120b) | 2025-08-05 | 804 |
| Quality first: how Fireworks.ai is the go-to place for gpt-oss | 2025-08-12 | 1,030 |
| Audio September Release - Streaming Transcription V2 and Streaming Speaker Diarization | 2025-10-06 | 789 |
| Partnering with Meta to bring Llama 3 to Firework’s inference and fine-tuning | 2024-04-18 | 729 |
| Document inlining: Crossing the modality gap with Compound AI | 2025-10-06 | 1,685 |
| 20x faster Whisper than OpenAI - Fireworks audio transcribes 1 hour in … | 2024-12-09 | 1,307 |
| Agentic AI Systems | 2025-05-19 | 1,900 |
| Introducing Supervised Fine Tuning V2 | 2025-06-13 | 735 |
| Understanding Function Calling: The Bridge to Agentic AI | 2025-07-11 | 1,203 |
| Build customizable, real-time voice agents with Fireworks Voice Agent Platform (Beta) | 2025-10-06 | 889 |
| FireAttention — Serving Open Source Models 4x faster than vLLM by quantizing … | 2024-01-08 | 1,278 |
| VibeRL: When AI Trains AI | 2025-07-22 | 697 |
| How Cresta drives millions of real-time, AI-powered contact center interactions with Fireworks | 2024-12-08 | 1,085 |
| Fireworks Summer Audio Updates: Fastest Transcription now with Diarization and Batch API | 2025-10-06 | 1,362 |
| Multi-LoRA: Personalize AI at scale and deliver the best experience for each … | 2024-09-18 | 1,201 |
| Partnering with Meta: Bringing Llama 3.2 to Fireworks for Fine-Tuning and Inference | 2024-09-25 | 1,689 |
| Deep-dive into MuonClip: Fixing Attention Score Explosions in Transformer Training | 2025-07-15 | 2,699 |
| Deep-Dive into LLM Fine-Tuning | 2025-10-06 | 1,976 |
| Enabling Function Calling in DeepSeek v3: Bridging the Gap Between Text and … | 2025-02-14 | 2,159 |
| Faster, more efficient DeepSeek on the Fireworks AI Developer Cloud | 2025-03-18 | 386 |
| Fireworks AI Now Supports Amazon SageMaker | 2025-07-15 | 448 |
| Vision Model Platform Updates: Enhanced Capabilities and New Features | 2025-06-12 | 1,133 |
| FireAttention V4: Industry-Leading Latency and Cost Efficiency with FP4 | 2025-05-28 | 1,011 |
| Fireworks Streaming Transcription: 300ms with Whisper-v3-large-quality | 2025-10-06 | 1,119 |
| Kimi K2: Deep Dive into model performance and use-cases | 2025-08-01 | 989 |
| DeepSeek V3 just got vision capabilities! | 2024-12-18 | 471 |
| Build Your Own Flight Recommendation System using FastAPI, SerpAPI, and Firefunction | 2024-08-29 | 4,283 |
| Introducing Llama 3.1 inference endpoints in partnership with Meta | 2024-07-23 | 805 |
| FireFunction V1 - Fireworks’ GPT-4-level function calling model - 4x faster than … | 2024-02-20 | 1,598 |
| FireOptimizer: Customizing latency and quality for your production inference workload | 2024-08-30 | 1,685 |
| Beyond Supervised Fine Tuning: How Reinforcement Learning Empowers AI with Minimal Labels | 2025-01-27 | 1,905 |
| Test-Driven Agent Development with Eval Protocol | 2025-08-14 | 1,501 |
| How Fireworks evaluates quantization precisely and interpretably | 2024-08-01 | 2,277 |
| Qwen 3 on Fireworks AI: Controllable Chain-of-Thought and Tool Calling at Frontier … | 2025-05-06 | 815 |
| Fireworks launches fine-tuning service - Rapidly iterate on quality and scale to … | 2024-03-08 | 1,138 |
| Traces Are All You Need (to rank LLMs) | 2025-09-22 | 2,091 |
| Optimizing Retrieval Augmented Generation (RAG) with MongoDB Atlas and Fireworks AI | 2024-03-21 | 1,904 |
| Firefunction-v2: Function calling capability on par with GPT4o at 2.5x the speed … | 2024-06-17 | 1,684 |
| Mistral Small 3 Now Available on Fireworks: Faster, Lighter, and More Efficient | 2025-01-30 | 347 |
| Building a RAG with Astro, FastAPI, SurrealDB and Llama 3.1 | 2024-08-14 | 3,514 |
| Launching Fireworks for Startups Program! | 2025-10-01 | 473 |
| Global Fast Food Group Transforms Drive-Thru with Real-Time Voice Intelligence with Fireworks | 2025-10-06 | 1,019 |
| Introducing FLUX.1 Kontext on Fireworks | 2025-07-09 | 372 |
| Fireworks Platform Spring 2024 Updates | 2024-03-01 | 1,572 |
| Fine-Tuning DeepSeek v3 & R1 to optimize quality, latency, & cost | 2025-03-12 | 890 |
| Building a High‑Quality Synthetic Data Pipeline for Supervised Fine‑Tuning | 2025-06-04 | 972 |
| Code Generation with Large Language Models - Fireworks AI Take | 2024-05-08 | 1,466 |
| DeepSeek R1 Just Got Eyes with Fireworks AI Document Inlining | 2025-02-05 | 2,194 |
| Three projects, one platform: A developer's winning streak with Fireworks AI | 2024-10-14 | 1,561 |
| DeepSeek v3 and R1 Model Architecture: Why it's powerful and economical | 2025-02-07 | 1,663 |
| Building an open-source Browser Agent on Fireworks AI | 2025-05-21 | 2,613 |
| FireLLaVA: the first commercially permissive OSS LLaVA model | 2024-01-18 | 933 |
| Your AI Benchmark is Lying to You. Here's How We Caught It | 2025-08-15 | 4,108 |
| Introducing Vision-Language Model Fine-tuning: Tailor VLMs to Your Domain | 2025-07-29 | 885 |
| Supervised Fine-Tuning (SFT) with LoRA on Fireworks AI: Tutorial | 2025-05-12 | 1,037 |
| Run bulk async workloads with Fireworks Batch API | 2025-07-31 | 419 |
| Distillation with Reasoning: Can DeepSeek R1 Teach Better Than Humans? | 2025-01-31 | 853 |
| Qwen3 Decoded: Choosing the Right Model For Your Task | 2025-08-01 | 2,790 |
| Build for Scale with Fireworks Virtual Cloud (GA) | 2025-06-16 | 1,088 |
| Building Enterprise-Scale RAG Systems with Fireworks AI and MongoDB Atlas | 2025-04-09 | 1,702 |
| Building AI agents with the Fireworks Experimentation Platform (GA) and Build SDK … | 2025-06-11 | 1,241 |
| FLUX.1 on Fireworks: Fast, frugal, and flexible | 2024-10-22 | 1,107 |
| LLM Eval Driven Development with Claude Code | 2025-08-25 | 1,394 |
| Unlock Your Tools: Fireworks Adds OpenAI-Response API with MCP Support (Beta) | 2025-06-22 | 1,088 |
| Understanding Embeddings and Reranking at Scale | 2025-09-12 | 1,546 |
| From text to task: Constrained generation for structured extraction in R1 | 2025-02-01 | 5,968 |
| Using Model-as-a-Judge for Reward in Reinforcement Fine Tuning | 2025-07-10 | 765 |
| GPUs on-demand: Not serverless, not reserved, but some third thing | 2024-06-03 | 1,648 |
| Announcing Eval Protocol | 2025-08-04 | 783 |
| How Upwork and Fireworks deliver faster, smarter proposals for freelancers | 2024-11-11 | 990 |
| Fireworks f1: A breakthrough in complex reasoning with Compound AI | 2024-11-15 | 535 |
| How Cursor built Fast Apply using the Speculative Decoding API | 2024-06-23 | 997 |
| Fireworks AI Raises $52M Series B to Lead Industry Shift to Compound … | 2024-07-11 | 1,070 |
| Production-Ready AI Agents with Optimized Inference with AWS AgentCore | 2025-10-02 | 401 |
| Doomed to Code: How we Teamed Up with Fireworks AI at MistralAI … | 2024-05-06 | 3,024 |
| Sentient & Fireworks Powers Decentralized AI At Viral Scale | 2025-07-17 | 1,333 |
| FireAttention V2: 12x faster to make Long Contexts practical for Online Inference | 2024-06-20 | 848 |
| Announcing Embeddings and Reranking On Fireworks AI | 2025-10-09 | 870 |
| Optimizing Llama 4 Maverick on Fireworks AI | 2025-04-28 | 1,151 |
| DeepSeek V3.1 now on Fireworks AI! | 2025-08-26 | 613 |
| How Enterprises are using Multimodal Models in production with Fireworks | 2024-09-25 | 596 |
| Fireworks AI Now Supports NVIDIA NIM Deployments for Blazing AI Inference | 2025-03-18 | 851 |
| 3D FireOptimizer: Automating the Multi-Dimensional Tradeoffs in LLM Serving | 2025-06-14 | 1,385 |
| Real-time, performant code assistance: How Sourcegraph scaled with Fireworks AI | 2025-01-22 | 1,214 |
| Reinforcement Fine Tuning (Beta): Train expert open models to surpass closed frontier … | 2025-06-09 | 885 |
| FireAttention V3: Enabling AMD as a viable alternative for GPU inference | 2024-10-15 | 1,856 |
| DeepSeek R1: All you need to know 🐳 | 2025-01-24 | 1,431 |
| Getting Started with Stability’s API Powered by Fireworks | 2024-04-17 | 987 |
| How Notion Cuts Latency 4x and Scales Enterprise AI Workflows with Fireworks … | 2025-07-25 | 514 |
| LLM on the edge: Model picking with Fireworks Eval Protocol + Ollama | 2025-10-15 | 852 |
| Fireworks and AMD partner to power the next gen of AI infrastructure … | 2025-10-20 | 341 |
| Deployment Shapes: One-Click Deployment Configured For You | 2025-10-23 | 828 |
| We raised $250M To Help Enterprises Own Their AI | 2025-10-28 | 788 |
| Accelerate your Vision Pipelines with the new NVIDIA Nemotron Nano 2 VL … | 2025-10-27 | 793 |
| Genspark’s Deep Research Agent Outperforms a Frontier Closed Model in Quality and … | 2025-10-31 | 1,063 |
| 40X Faster, and Smarter Outputs: How Vercel Turbocharged their Code Fixing Model … | 2025-11-03 | 1,025 |
| Fireworks RFT: Build AI agents with fine-tuned open models that outperform frontier … | 2025-11-10 | 1,023 |
| Modernizing Healthcare with AI: How RADPAIR and Fireworks Unlock Smarter Radiology Workflows | 2025-11-09 | 2,365 |
| 50 Trillion Tokens Per Day: The State of Agent Environments | 2025-11-19 | 2,333 |
| Fireworks Achieves Triple ISO Certification, giving Enterprises Full Control and Trust in … | 2025-11-19 | 739 |
| Eval Protocol: RL on your agents, in any environment | 2025-11-20 | 1,316 |
| Fireworks Expands AWS Alliance: Strategic Collaboration Agreement + GenAI Competency | 2025-11-24 | 593 |
| Unlock Advanced Reasoning with NVIDIA Nemotron Nano 2 Models on Fireworks AI | 2025-12-02 | 1,294 |
| Turn Your LLM into a Calibrated Classifier for $2 | 2025-12-04 | 2,523 |
| Best Practices for Multi-Turn RL | 2025-12-10 | 2,796 |
| NVIDIA Nemotron 3 Nano on Fireworks: The Engine for Next-Generation AI Agents | 2025-12-15 | 787 |
| Self-Improving Agents, Powered by Your Evals | 2025-12-17 | 1,339 |
| DPO, your simplest RL pipeline with two rollouts | 2025-12-31 | 3,074 |
| A Deep Dive into MLA training/inference difference and why QK-Clip from Kimi … | 2025-07-22 | 5,418 |
| Turning Production Logs into Evaluation Datasets: A Data-Driven Approach | 2026-01-23 | 1,274 |
| Kimi K2.5 is Live on Fireworks: Vibe Coding, Agents, and Full-Parameter RFT | 2026-01-26 | 790 |
| Build powerful agents on OSS models with Blazing Fast Inference on Fireworks | 2026-01-27 | 379 |
| The Missing Piece of the OpenClaw Mania: Truly ‘Own Your AI’ with … | 2026-01-30 | 964 |
| The Benchmark Gap: What It Takes to Ship Kimi K2.5 | 2026-02-03 | 2,040 |
| Training-Inference Parity in MoE Models: Where Numerics Drift | 2026-03-10 | 2,902 |
| Fireworks Acquires Hathora to Accelerate Global Compute Orchestration | 2026-03-10 | 458 |
| Introducing Fireworks on Microsoft Foundry: Bringing Best-in-Class Open Model inference to Azure | 2026-03-08 | 741 |
| Best Open Source LLMs in 2026: We Reviewed 7 Models | 2026-01-13 | 5,177 |
| Why Building Mega Clusters Is Wrong | 2026-03-10 | 2,382 |
| Frontier RL Is Cheaper Than You Think | 2026-03-23 | 2,138 |
| The Fine-Tuning Bottleneck Isn't the Algorithm | 2026-03-28 | 1,800 |
| Scaling and Optimizing Frontier Model Training | 2026-04-03 | 2,555 |
| [staged] Introducing The Inference Fabric: Own Your AI | 2026-04-06 | 2,309 |
| Own Your AI: Fireworks Training Preview | 2026-04-06 | 1,291 |
| The DeepSeek Model Lineup: V3.2, R1, and Distilled Variants Mapped to Production … | 2026-02-27 | 2,543 |
| How We Protect from Prompt Injection on Fireworks AI | 2026-04-03 | 2,138 |
| How we fixed prompt injection for all models on Fireworks | 2026-04-24 | 2,088 |
| Notes on DeepSeek-V4's training system | 2026-04-24 | 2,332 |
| DeepSeek V4 Pro: Validating Frontier Models For Production | 2026-04-27 | 1,272 |
| Best LLMs for coding in 2026 | 2026-03-02 | 9,636 |
| Innovative Solutions Rebuilds Enterprise Services Delivery with Fireworks AI | 2026-05-05 | 1,403 |
| Agents Don't Fail on Intelligence. They Fail on Execution. | 2026-05-20 | 5,118 |
| The Best 8 LLM API Providers in 2026 | 2026-03-04 | 10,131 |
| Serverless 2.0: Three Ways to Run Inference, One API | 2026-05-26 | 1,728 |
| Trilogy Validates Open-Weight AI Models for Enterprise AI Workloads with Fireworks | 2026-06-01 | 1,206 |
| Open-source agents with frontier advisors: matching frontier performance through training and harness … | 2026-06-03 | 2,368 |
| NVIDIA Nemotron 3 Ultra is live on Fireworks, day zero | 2026-06-04 | 531 |
| Inference Providers vs. API Routers: where do tokens come from? | 2026-03-06 | 1,575 |
| MiniMax M3 is live: long context + native multimodality at 1/20th the … | 2026-06-12 | 1,160 |
| Qwen 3.7 Plus on Fireworks: Run it today. | 2026-06-12 | 1,189 |
| Kimi K2.7 Code on Fireworks: Better Agents, Lower Cost per Task, Available … | 2026-06-12 | 821 |
| GLM 5.2 is live on Fireworks inference, day zero. | 2026-06-16 | 1,112 |
| Fireworks is moving to prepaid billing on July 1st | 2026-06-18 | 608 |