| Title | Date | Count |
| --- | --- | --- |
| Fireworks DevDay 2025 Wrapped | 2025-10-06 | 990 |
| Why do all LLMs need structured output modes? | 2025-10-06 | 2,806 |
| New in Fireworks: Image-to-Image and ControlNet support for SSD-1B and SDXL! | 2025-10-06 | 859 |
| Announcing custom models and on-demand H100s with 50%+ lower costs and latency … | 2025-10-06 | 1,121 |
| Fireworks Real-World Benchmarks: Find the Best OSS Model for the Job | 2025-10-06 | 765 |
| Introducing OpenAI gpt-oss (20b & 120b) | 2025-10-06 | 872 |
| Quality first: how Fireworks.ai is the go-to place for gpt-oss | 2025-10-06 | 1,094 |
| Audio September Release - Streaming Transcription V2 and Streaming Speaker Diarization | 2025-10-06 | 789 |
| Partnering with Meta to bring Llama 3 to Firework’s inference and fine-tuning | 2025-10-06 | 800 |
| Document inlining: Crossing the modality gap with Compound AI | 2025-10-06 | 1,685 |
| 20x faster Whisper than OpenAI - Fireworks audio transcribes 1 hour in … | 2025-10-06 | 1,346 |
| Agentic AI Systems | 2025-10-06 | 1,946 |
| Introducing Supervised Fine Tuning V2 | 2025-10-06 | 789 |
| Understanding Function Calling: The Bridge to Agentic AI | 2025-10-06 | 1,251 |
| Fireworks.ai Achieves SOC 2 Type II and HIPAA Compliance | 2025-10-06 | 416 |
| Build customizable, real-time voice agents with Fireworks Voice Agent Platform (Beta) | 2025-10-06 | 889 |
| FireAttention — Serving Open Source Models 4x faster than vLLM by quantizing … | 2025-10-06 | 1,336 |
| Mixtral 8x7B on Fireworks: faster, cheaper, even before the official release | 2025-10-06 | 844 |
| VibeRL: When AI Trains AI | 2025-10-06 | 749 |
| How Cresta drives millions of real-time, AI-powered contact center interactions with Fireworks | 2025-10-06 | 1,108 |
| Fireworks Summer Audio Updates: Fastest Transcription now with Diarization and Batch API | 2025-10-06 | 1,362 |
| Multi-LoRA: Personalize AI at scale and deliver the best experience for each … | 2025-10-06 | 1,350 |
| Multi-Query Attention is All You Need | 2025-10-06 | 3,781 |
| Partnering with Meta: Bringing Llama 3.2 to Fireworks for Fine-Tuning and Inference | 2025-10-06 | 1,777 |
| Deep-dive into MuonClip: Fixing Attention Score Explosions in Transformer Training | 2025-10-06 | 2,759 |
| Deep-Dive into LLM Fine-Tuning | 2025-10-06 | 1,987 |
| Enabling Function Calling in DeepSeek v3: Bridging the Gap Between Text and … | 2025-10-06 | 2,220 |
| Simplifying Code Infilling with Code Llama and Fireworks.ai | 2025-10-06 | 443 |
| Faster, more efficient DeepSeek on the Fireworks AI Developer Cloud | 2025-10-06 | 425 |
| Fireworks AI Now Supports Amazon SageMaker | 2025-10-06 | 488 |
| Vision Model Platform Updates: Enhanced Capabilities and New Features | 2025-10-06 | 1,174 |
| FireAttention V4: Industry-Leading Latency and Cost Efficiency with FP4 | 2025-10-06 | 1,086 |
| Fireworks Streaming Transcription: 300ms with Whisper-v3-large-quality | 2025-10-06 | 1,119 |
| Kimi K2: Deep Dive into model performance and use-cases | 2025-10-06 | 1,051 |
| DeepSeek V3 just got vision capabilities! | 2025-10-06 | 525 |
| Build Your Own Flight Recommendation System using FastAPI, SerpAPI, and Firefunction | 2025-10-06 | 4,353 |
| Introducing Llama 3.1 inference endpoints in partnership with Meta | 2025-10-06 | 874 |
| FireFunction V1 - Fireworks’ GPT-4-level function calling model - 4x faster than … | 2025-10-06 | 1,647 |
| FireOptimizer: Customizing latency and quality for your production inference workload | 2025-10-06 | 1,736 |
| Beyond Supervised Fine Tuning: How Reinforcement Learning Empowers AI with Minimal Labels | 2025-10-06 | 1,970 |
| Test-Driven Agent Development with Eval Protocol | 2025-10-06 | 1,569 |
| How Fireworks evaluates quantization precisely and interpretably | 2025-10-06 | 2,301 |
| Qwen 3 on Fireworks AI: Controllable Chain-of-Thought and Tool Calling at Frontier … | 2025-10-06 | 869 |
| Fireworks launches fine-tuning service - Rapidly iterate on quality and scale to … | 2025-10-06 | 1,189 |
| Traces Are All You Need (to rank LLMs) | 2025-10-06 | 2,174 |
| Optimizing Retrieval Augmented Generation (RAG) with MongoDB Atlas and Fireworks AI | 2025-10-06 | 1,980 |
| Firefunction-v2: Function calling capability on par with GPT4o at 2.5x the speed … | 2025-10-06 | 1,737 |
| Fireworks.ai Now Available on LangChain Prompt Playground | 2025-10-06 | 821 |
| Mistral Small 3 Now Available on Fireworks: Faster, Lighter, and More Efficient | 2025-10-06 | 407 |
| Fireworks Raises the Quality Bar with Function Calling Model and API Release | 2025-10-06 | 2,257 |
| Building a RAG with Astro, FastAPI, SurrealDB and Llama 3.1 | 2025-10-06 | 3,598 |
| Launching Fireworks for Startups Program! | 2025-10-06 | 495 |
| Global Fast Food Group Transforms Drive-Thru with Real-Time Voice Intelligence with Fireworks | 2025-10-06 | 1,019 |
| Introducing FLUX.1 Kontext on Fireworks | 2025-10-06 | 408 |
| Fireworks Platform Spring 2024 Updates | 2025-10-06 | 1,609 |
| Fine-Tuning DeepSeek v3 & R1 to optimize quality, latency, & cost | 2025-10-06 | 963 |
| Building a High‑Quality Synthetic Data Pipeline for Supervised Fine‑Tuning | 2025-10-06 | 996 |
| Code Generation with Large Language Models - Fireworks AI Take | 2025-10-06 | 1,561 |
| DeepSeek R1 Just Got Eyes with Fireworks AI Document Inlining | 2025-10-06 | 2,265 |
| Three projects, one platform: A developer's winning streak with Fireworks AI | 2025-10-06 | 1,600 |
| DeepSeek v3 and R1 Model Architecture: Why it's powerful and economical | 2025-10-06 | 1,761 |
| Building an open-source Browser Agent on Fireworks AI | 2025-10-06 | 2,718 |
| FireLLaVA: the first commercially permissive OSS LLaVA model | 2025-10-06 | 991 |
| Your AI Benchmark is Lying to You. Here's How We Caught It | 2025-10-06 | 4,163 |
| Introducing Vision-Language Model Fine-tuning: Tailor VLMs to Your Domain | 2025-10-06 | 938 |
| Supervised Fine-Tuning (SFT) with LoRA on Fireworks AI: Tutorial | 2025-10-06 | 1,034 |
| Run bulk async workloads with Fireworks Batch API | 2025-10-06 | 450 |
| Distillation with Reasoning: Can DeepSeek R1 Teach Better Than Humans? | 2025-10-06 | 905 |
| Qwen3 Decoded: Choosing the Right Model For Your Task | 2025-10-06 | 2,790 |
| Build for Scale with Fireworks Virtual Cloud (GA) | 2025-10-06 | 1,128 |
| Building Enterprise-Scale RAG Systems with Fireworks AI and MongoDB Atlas | 2025-10-06 | 1,745 |
| Building AI agents with the Fireworks Experimentation Platform (GA) and Build SDK … | 2025-10-06 | 1,301 |
| FLUX.1 on Fireworks: Fast, frugal, and flexible | 2025-10-06 | 1,137 |
| LLM Eval Driven Development with Claude Code | 2025-10-06 | 1,454 |
| Unlock Your Tools: Fireworks Adds OpenAI-Response API with MCP Support (Beta) | 2025-10-06 | 1,152 |
| Fireworks.ai: Fast, Affordable, Customizable Gen AI Platform | 2025-10-06 | 1,614 |
| Understanding Embeddings and Reranking at Scale | 2025-10-06 | 1,612 |
| From text to task: Constrained generation for structured extraction in R1 | 2025-10-06 | 5,992 |
| LLM Inference Performance Benchmarking (Part 1) | 2025-10-06 | 747 |
| Using Model-as-a-Judge for Reward in Reinforcement Fine Tuning | 2025-10-06 | 824 |
| GPUs on-demand: Not serverless, not reserved, but some third thing | 2025-10-06 | 1,670 |
| Announcing Eval Protocol | 2025-10-06 | 829 |
| How Upwork and Fireworks deliver faster, smarter proposals for freelancers | 2025-10-06 | 1,026 |
| Fireworks f1: A breakthrough in complex reasoning with Compound AI | 2025-10-06 | 605 |
| How Cursor built Fast Apply using the Speculative Decoding API | 2025-10-06 | 1,052 |
| Fireworks AI Raises $52M Series B to Lead Industry Shift to Compound … | 2025-10-06 | 1,132 |
| Speed, Python: Pick Two. How CUDA Graphs Enable Fast Python Code for … | 2025-10-06 | 3,679 |
| Production-Ready AI Agents with Optimized Inference with AWS AgentCore | 2025-10-06 | 451 |
| Doomed to Code: How we Teamed Up with Fireworks AI at MistralAI … | 2025-10-06 | 3,100 |
| Sentient & Fireworks Powers Decentralized AI At Viral Scale | 2025-10-06 | 1,412 |
| FireAttention V2: 12x faster to make Long Contexts practical for Online Inference | 2025-10-06 | 891 |
| Announcing Embeddings and Reranking On Fireworks AI | 2025-10-15 | 899 |
| Optimizing Llama 4 Maverick on Fireworks AI | 2025-10-06 | 1,205 |
| DeepSeek V3.1 now on Fireworks AI! | 2025-10-06 | 653 |
| How Enterprises are using Multimodal Models in production with Fireworks | 2025-10-06 | 686 |
| Fireworks AI Now Supports NVIDIA NIM Deployments for Blazing AI Inference | 2025-10-06 | 884 |
| 3D FireOptimizer: Automating the Multi-Dimensional Tradeoffs in LLM Serving | 2025-10-06 | 1,447 |
| Accelerating Code Completion with Fireworks Fast LLM Inference | 2025-10-06 | 639 |
| Real-time, performant code assistance: How Sourcegraph scaled with Fireworks AI | 2025-10-06 | 1,220 |
| Reinforcement Fine Tuning (Beta): Train expert open models to surpass closed frontier … | 2025-10-06 | 958 |
| FireAttention V3: Enabling AMD as a viable alternative for GPU inference | 2025-10-06 | 1,910 |
| DeepSeek R1: All you need to know 🐳 | 2025-10-06 | 1,502 |
| Getting Started with Stability’s API Powered by Fireworks | 2025-10-06 | 1,040 |
| How Notion Cuts Latency 4x and Scales Enterprise AI Workflows with Fireworks … | 2025-10-06 | 584 |
| LLM on the edge: Model picking with Fireworks Eval Protocol + Ollama | 2025-10-18 | 912 |
| Fireworks and AMD partner to power the next gen of AI infrastructure … | 2025-10-20 | 415 |
| Deployment Shapes: One-Click Deployment Configured For You | 2025-10-24 | 875 |
| We raised $250M To Help Enterprises Own Their AI | 2025-10-28 | 818 |
| Accelerate your Vision Pipelines with the new NVIDIA Nemotron Nano 2 VL … | 2025-10-27 | 831 |
| Genspark’s Deep Research Agent Outperforms a Frontier Closed Model in Quality and … | 2025-11-01 | 1,126 |
| 40X Faster, and Smarter Outputs: How Vercel Turbocharged their Code Fixing Model … | 2025-10-31 | 1,086 |
| Fireworks RFT: Build AI agents with fine-tuned open models that outperform frontier … | 2025-11-11 | 1,046 |
| Modernizing Healthcare with AI: How RADPAIR and Fireworks Unlock Smarter Radiology Workflows | 2025-11-09 | 2,408 |
| 50 Trillion Tokens Per Day: The State of Agent Environments | 2025-11-19 | 2,411 |
| Fireworks Achieves Triple ISO Certification, giving Enterprises Full Control and Trust in … | 2025-11-20 | 771 |
| Eval Protocol: RL on your agents, in any environment | 2025-11-21 | 1,400 |
| Fireworks Expands AWS Alliance: Strategic Collaboration Agreement + GenAI Competency | 2025-11-25 | 646 |
| Unlock Advanced Reasoning with NVIDIA Nemotron Nano 2 Models on Fireworks AI | 2025-11-24 | 1,343 |
| Turn Your LLM into a Classifier for $2 | 2025-12-05 | 2,409 |
| Best Practices for Multi-Turn RL | 2025-12-12 | 2,797 |
| NVIDIA Nemotron 3 Nano on Fireworks: The Engine for Next-Generation AI Agents | 2025-12-16 | 843 |
| Self-Improving Agents, Powered by Your Evals | 2025-12-17 | 1,389 |
| DPO, your simplest RL pipeline with two rollouts | 2026-01-07 | 3,103 |