116 blog posts published by month since the start of 2022.

Posts year-to-date
116 (0 posts by this month last year.)
Average posts per month since 2022
0.0

Post details (2022 to today)

| Title | Author | Date | Word count | HN points |
| --- | --- | --- | --- | --- |
| Fireworks DevDay 2025 Wrapped | - | Oct 06, 2025 | 990 | - |
| Why do all LLMs need structured output modes? | - | Oct 06, 2025 | 2806 | - |
| New in Fireworks: Image-to-Image and ControlNet support for SSD-1B and SDXL! | - | Oct 06, 2025 | 859 | - |
| Announcing custom models and on-demand H100s with 50%+ lower costs and latency than vLLM | - | Oct 06, 2025 | 1121 | - |
| Fireworks Real-World Benchmarks: Find the Best OSS Model for the Job | - | Oct 06, 2025 | 765 | - |
| Introducing OpenAI gpt-oss (20b & 120b) | - | Oct 06, 2025 | 872 | - |
| Quality first: how Fireworks.ai is the go-to place for gpt-oss | - | Oct 06, 2025 | 1094 | - |
| Audio September Release - Streaming Transcription V2 and Streaming Speaker Diarization | - | Oct 06, 2025 | 789 | - |
| Partnering with Meta to bring Llama 3 to Firework’s inference and fine-tuning | - | Oct 06, 2025 | 800 | - |
| Document inlining: Crossing the modality gap with Compound AI | - | Oct 06, 2025 | 1685 | - |
| 20x faster Whisper than OpenAI - Fireworks audio transcribes 1 hour in 4 seconds | - | Oct 06, 2025 | 1346 | - |
| Agentic AI Systems | - | Oct 06, 2025 | 1946 | - |
| Introducing Supervised Fine Tuning V2 | - | Oct 06, 2025 | 789 | - |
| Understanding Function Calling: The Bridge to Agentic AI | - | Oct 06, 2025 | 1251 | - |
| Fireworks.ai Achieves SOC 2 Type II and HIPAA Compliance | - | Oct 06, 2025 | 416 | - |
| Build customizable, real-time voice agents with Fireworks Voice Agent Platform (Beta) | - | Oct 06, 2025 | 889 | - |
| FireAttention — Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs | - | Oct 06, 2025 | 1336 | - |
| Mixtral 8x7B on Fireworks: faster, cheaper, even before the official release | - | Oct 06, 2025 | 844 | - |
| VibeRL: When AI Trains AI | - | Oct 06, 2025 | 749 | - |
| How Cresta drives millions of real-time, AI-powered contact center interactions with Fireworks | - | Oct 06, 2025 | 1108 | - |
| Fireworks Summer Audio Updates: Fastest Transcription now with Diarization and Batch API | - | Oct 06, 2025 | 1362 | - |
| Multi-LoRA: Personalize AI at scale and deliver the best experience for each customer and use case, with 100x cost-efficiency | - | Oct 06, 2025 | 1350 | - |
| Multi-Query Attention is All You Need | - | Oct 06, 2025 | 3781 | - |
| Partnering with Meta: Bringing Llama 3.2 to Fireworks for Fine-Tuning and Inference | - | Oct 06, 2025 | 1777 | - |
| Deep-dive into MuonClip: Fixing Attention Score Explosions in Transformer Training | - | Oct 06, 2025 | 2759 | - |
| Deep-Dive into LLM Fine-Tuning | - | Oct 06, 2025 | 1987 | - |
| Enabling Function Calling in DeepSeek v3: Bridging the Gap Between Text and Action | - | Oct 06, 2025 | 2220 | - |
| Simplifying Code Infilling with Code Llama and Fireworks.ai | - | Oct 06, 2025 | 443 | - |
| Faster, more efficient DeepSeek on the Fireworks AI Developer Cloud | - | Oct 06, 2025 | 425 | - |
| Fireworks AI Now Supports Amazon SageMaker | - | Oct 06, 2025 | 488 | - |
| Vision Model Platform Updates: Enhanced Capabilities and New Features | - | Oct 06, 2025 | 1174 | - |
| FireAttention V4: Industry-Leading Latency and Cost Efficiency with FP4 | - | Oct 06, 2025 | 1086 | - |
| Fireworks Streaming Transcription: 300ms with Whisper-v3-large-quality | - | Oct 06, 2025 | 1119 | - |
| Kimi K2: Deep Dive into model performance and use-cases | - | Oct 06, 2025 | 1051 | - |
| DeepSeek V3 just got vision capabilities! | - | Oct 06, 2025 | 525 | - |
| Build Your Own Flight Recommendation System using FastAPI, SerpAPI, and Firefunction | - | Oct 06, 2025 | 4353 | - |
| Introducing Llama 3.1 inference endpoints in partnership with Meta | - | Oct 06, 2025 | 874 | - |
| FireFunction V1 - Fireworks’ GPT-4-level function calling model - 4x faster than GPT-4 and open weights | - | Oct 06, 2025 | 1647 | - |
| FireOptimizer: Customizing latency and quality for your production inference workload | - | Oct 06, 2025 | 1736 | - |
| Beyond Supervised Fine Tuning: How Reinforcement Learning Empowers AI with Minimal Labels | - | Oct 06, 2025 | 1970 | - |
| Test-Driven Agent Development with Eval Protocol | - | Oct 06, 2025 | 1569 | - |
| How Fireworks evaluates quantization precisely and interpretably | - | Oct 06, 2025 | 2301 | - |
| Qwen 3 on Fireworks AI: Controllable Chain-of-Thought and Tool Calling at Frontier Scale | - | Oct 06, 2025 | 869 | - |
| Fireworks launches fine-tuning service - Rapidly iterate on quality and scale to production through Fireworks inference | - | Oct 06, 2025 | 1189 | - |
| Traces Are All You Need (to rank LLMs) | - | Oct 06, 2025 | 2174 | - |
| Optimizing Retrieval Augmented Generation (RAG) with MongoDB Atlas and Fireworks AI | - | Oct 06, 2025 | 1980 | - |
| Firefunction-v2: Function calling capability on par with GPT4o at 2.5x the speed and 10% of the cost | - | Oct 06, 2025 | 1737 | - |
| Fireworks.ai Now Available on LangChain Prompt Playground | - | Oct 06, 2025 | 821 | - |
| Mistral Small 3 Now Available on Fireworks: Faster, Lighter, and More Efficient | - | Oct 06, 2025 | 407 | - |
| Fireworks Raises the Quality Bar with Function Calling Model and API Release | - | Oct 06, 2025 | 2257 | - |
| Building a RAG with Astro, FastAPI, SurrealDB and Llama 3.1 | - | Oct 06, 2025 | 3598 | - |
| Launching Fireworks for Startups Program! | - | Oct 06, 2025 | 495 | - |
| Global Fast Food Group Transforms Drive-Thru with Real-Time Voice Intelligence with Fireworks | - | Oct 06, 2025 | 1019 | - |
| Introducing FLUX.1 Kontext on Fireworks | - | Oct 06, 2025 | 408 | - |
| Fireworks Platform Spring 2024 Updates | - | Oct 06, 2025 | 1609 | - |
| Fine-Tuning DeepSeek v3 & R1 to optimize quality, latency, & cost | - | Oct 06, 2025 | 963 | - |
| Building a High‑Quality Synthetic Data Pipeline for Supervised Fine‑Tuning | - | Oct 06, 2025 | 996 | - |
| Code Generation with Large Language Models - Fireworks AI Take | - | Oct 06, 2025 | 1561 | - |
| DeepSeek R1 Just Got Eyes with Fireworks AI Document Inlining | - | Oct 06, 2025 | 2265 | - |
| Three projects, one platform: A developer's winning streak with Fireworks AI | - | Oct 06, 2025 | 1600 | - |
| DeepSeek v3 and R1 Model Architecture: Why it's powerful and economical | - | Oct 06, 2025 | 1761 | - |
| Building an open-source Browser Agent on Fireworks AI | - | Oct 06, 2025 | 2718 | - |
| FireLLaVA: the first commercially permissive OSS LLaVA model | - | Oct 06, 2025 | 991 | - |
| Your AI Benchmark is Lying to You. Here's How We Caught It | - | Oct 06, 2025 | 4163 | - |
| Introducing Vision-Language Model Fine-tuning: Tailor VLMs to Your Domain | - | Oct 06, 2025 | 938 | - |
| Supervised Fine-Tuning (SFT) with LoRA on Fireworks AI: Tutorial | - | Oct 06, 2025 | 1034 | - |
| Run bulk async workloads with Fireworks Batch API | - | Oct 06, 2025 | 450 | - |
| Distillation with Reasoning: Can DeepSeek R1 Teach Better Than Humans? | - | Oct 06, 2025 | 905 | - |
| Qwen3 Decoded: Choosing the Right Model For Your Task | - | Oct 06, 2025 | 2790 | - |
| Build for Scale with Fireworks Virtual Cloud (GA) | - | Oct 06, 2025 | 1128 | - |
| Building Enterprise-Scale RAG Systems with Fireworks AI and MongoDB Atlas | - | Oct 06, 2025 | 1745 | - |
| Building AI agents with the Fireworks Experimentation Platform (GA) and Build SDK (Beta) | - | Oct 06, 2025 | 1301 | - |
| FLUX.1 on Fireworks: Fast, frugal, and flexible | - | Oct 06, 2025 | 1137 | - |
| LLM Eval Driven Development with Claude Code | - | Oct 06, 2025 | 1454 | - |
| Unlock Your Tools: Fireworks Adds OpenAI-Response API with MCP Support (Beta) | - | Oct 06, 2025 | 1152 | - |
| Fireworks.ai: Fast, Affordable, Customizable Gen AI Platform | - | Oct 06, 2025 | 1614 | - |
| Understanding Embeddings and Reranking at Scale | - | Oct 06, 2025 | 1612 | - |
| From text to task: Constrained generation for structured extraction in R1 | - | Oct 06, 2025 | 5992 | - |
| LLM Inference Performance Benchmarking (Part 1) | - | Oct 06, 2025 | 747 | - |
| Using Model-as-a-Judge for Reward in Reinforcement Fine Tuning | - | Oct 06, 2025 | 824 | - |
| GPUs on-demand: Not serverless, not reserved, but some third thing | - | Oct 06, 2025 | 1670 | - |
| Announcing Eval Protocol | - | Oct 06, 2025 | 829 | - |
| How Upwork and Fireworks deliver faster, smarter proposals for freelancers | - | Oct 06, 2025 | 1026 | - |
| Fireworks f1: A breakthrough in complex reasoning with Compound AI | - | Oct 06, 2025 | 605 | - |
| How Cursor built Fast Apply using the Speculative Decoding API | - | Oct 06, 2025 | 1052 | - |
| Fireworks AI Raises $52M Series B to Lead Industry Shift to Compound AI Systems | - | Oct 06, 2025 | 1132 | - |
| Speed, Python: Pick Two. How CUDA Graphs Enable Fast Python Code for Deep Learning | - | Oct 06, 2025 | 3679 | - |
| Production-Ready AI Agents with Optimized Inference with AWS AgentCore | - | Oct 06, 2025 | 451 | - |
| Doomed to Code: How we Teamed Up with Fireworks AI at MistralAI Hackathon to Conquer the Shores of Hell | - | Oct 06, 2025 | 3100 | - |
| Sentient & Fireworks Powers Decentralized AI At Viral Scale | - | Oct 06, 2025 | 1412 | - |
| FireAttention V2: 12x faster to make Long Contexts practical for Online Inference | - | Oct 06, 2025 | 891 | - |
| Announcing Embeddings and Reranking On Fireworks AI | - | Oct 15, 2025 | 899 | - |
| Optimizing Llama 4 Maverick on Fireworks AI | - | Oct 06, 2025 | 1205 | - |
| DeepSeek V3.1 now on Fireworks AI! | - | Oct 06, 2025 | 653 | - |
| How Enterprises are using Multimodal Models in production with Fireworks | - | Oct 06, 2025 | 686 | - |
| Fireworks AI Now Supports NVIDIA NIM Deployments for Blazing AI Inference | - | Oct 06, 2025 | 884 | - |
| 3D FireOptimizer: Automating the Multi-Dimensional Tradeoffs in LLM Serving | - | Oct 06, 2025 | 1447 | - |
| Accelerating Code Completion with Fireworks Fast LLM Inference | - | Oct 06, 2025 | 639 | - |
| Real-time, performant code assistance: How Sourcegraph scaled with Fireworks AI | - | Oct 06, 2025 | 1220 | - |
| Reinforcement Fine Tuning (Beta): Train expert open models to surpass closed frontier models | - | Oct 06, 2025 | 958 | - |
| FireAttention V3: Enabling AMD as a viable alternative for GPU inference | - | Oct 06, 2025 | 1910 | - |
| DeepSeek R1: All you need to know 🐳 | - | Oct 06, 2025 | 1502 | - |
| Getting Started with Stability’s API Powered by Fireworks | - | Oct 06, 2025 | 1040 | - |
| How Notion Cuts Latency 4x and Scales Enterprise AI Workflows with Fireworks AI | - | Oct 06, 2025 | 584 | - |
| LLM on the edge: Model picking with Fireworks Eval Protocol + Ollama | - | Oct 18, 2025 | 912 | - |
| Fireworks and AMD partner to power the next gen of AI infrastructure on AMD Instinct™ GPUs | - | Oct 20, 2025 | 415 | - |
| Deployment Shapes: One-Click Deployment Configured For You | - | Oct 24, 2025 | 875 | - |
| We raised $250M To Help Enterprises Own Their AI | - | Oct 28, 2025 | 818 | - |
| Accelerate your Vision Pipelines with the new NVIDIA Nemotron Nano 2 VL Model on Fireworks AI | - | Oct 27, 2025 | 831 | - |
| Genspark’s Deep Research Agent Outperforms a Frontier Closed Model in Quality and Tool Calls using Fireworks RFT, Achieving a 50% Cost Reduction | - | Nov 01, 2025 | 1126 | - |
| 40X Faster, and Smarter Outputs: How Vercel Turbocharged their Code Fixing Model with Open Models, Speculative Decoding and Reinforcement Fine Tuning on Fireworks? | - | Oct 31, 2025 | 1086 | - |
| Fireworks RFT: Build AI agents with fine-tuned open models that outperform frontier closed models | - | Nov 11, 2025 | 1046 | - |
| Modernizing Healthcare with AI: How RADPAIR and Fireworks Unlock Smarter Radiology Workflows | - | Nov 09, 2025 | 2408 | - |
| 50 Trillion Tokens Per Day: The State of Agent Environments | - | Nov 19, 2025 | 2411 | - |
| Fireworks Achieves Triple ISO Certification, giving Enterprises Full Control and Trust in AI at Scale | - | Nov 20, 2025 | 771 | - |
| Eval Protocol: RL on your agents, in any environment | - | Nov 21, 2025 | 1400 | - |