DeepInfra Blog - Plushcap

Blog URL

deepinfra.com/blog

Posts YTD

75 ↑ vs 6 last year

Avg Posts/Month

0.0 since 2026

Monthly Post Volume (chart; start years: 2023, 2024, 2025, 2026)

Post Details

Title	Author	Published	Words	HN Points
Pricing 101: Token Math & Cost-Per-Completion Explained	Deep	2026-01-13	6,002	--
From Precision to Quantization: A Practical Guide to Faster, Cheaper LLMs	Deep	2026-01-13	2,911	--
How the Models Perform on DeepInfra: Long-Context Performance, Throughput, and Cost	Deep	2026-01-13	1,730	--
Nemotron 3 Nano vs GPT-OSS-20B: Performance, Benchmarks & DeepInfra Results	Deep	2026-01-13	1,673	--
Build an OCR-Powered PDF Reader & Summarizer with DeepInfra (Kimi K2)	Deep	2026-01-13	3,944	--
LLM API Provider Performance KPIs 101: TTFT, Throughput & End-to-End Goals	Deep	2026-01-13	2,103	--
Nemotron 3 Nano Explained: NVIDIA’s Efficient Small LLM and Why It Matters	Deep	2026-01-13	2,280	--
Reliable JSON-Only Responses with DeepInfra LLMs	Deep	2026-02-02	1,713	--
Function Calling for AI APIs in DeepInfra — How to Extend Your …	Deep	2026-02-02	1,496	--
NVIDIA Nemotron API Pricing Guide 2026	Deep	2026-02-02	1,280	--
Best API for Kimi K2.5: Why DeepInfra Leads in Speed, TTFT, and …	Deep	2026-02-02	1,716	--
Build a Streaming Chat Backend in 10 Minutes	Deep	2026-02-02	2,435	--
Qwen API Pricing Guide 2026: Max Performance on a Budget	Deep	2026-02-02	1,412	--
Building Efficient AI Inference on NVIDIA Blackwell Platform	Deep	2026-02-12	1,084	--
Introducing NVIDIA Nemotron 3 Super on DeepInfra	Aray Sultanbekova	2026-03-11	938	--
Qwen3.5 27B API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,375	--
Qwen3.5 9B API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,298	--
Qwen3.5 4B via DeepInfra: Latency, Throughput & Cost	Deep	2026-04-03	1,099	--
GLM-5 API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,543	--
Kimi K2 0905 API Benchmarks: Latency, Throughput & Cost	han	2026-04-03	1,465	--
NVIDIA Nemotron 3 Super 120B API Benchmarks: Latency & Cost	Deep	2026-04-03	1,697	--
Qwen3 Coder 480B A35B API Benchmarks: Latency & Cost	Deep	2026-04-03	1,498	--
MiniMax-M2.5 API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,853	--
DeepSeek V3.2 API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	2,011	--
Kimi K2.5 API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,701	--
Qwen3.5 122B A10B API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,361	--
Step 3.5 Flash API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,632	--
Qwen3.5 0.8B API Benchmarks: Latency, Throughput & Cost	han	2026-04-03	1,312	--
Qwen3.5 397B A17B API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	2,094	--
Qwen3.5 2B via DeepInfra: Latency, Throughput & Cost	Deep	2026-04-03	1,087	--
NVIDIA Nemotron 3 Nano 30B API Benchmarks: Latency & Cost	Deep	2026-04-03	1,256	--
GLM-4.7-Flash API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,455	--
Qwen3.5 35B A3B API Benchmarks: Latency, Throughput & Cost	Deep	2026-04-03	1,201	--
Best Models for OpenClaw: Top Picks for Agentic Workloads	Deep	2026-04-28	2,642	--
Introducing NVIDIA Nemotron 3 Nano Omni on DeepInfra	Aray Sultanbekova	2026-04-28	1,109	--
What Is Google TurboQuant and What Does It Mean for Open Source …	Deep	2026-04-28	1,988	--
Inference Economics: True AI Costs at Scale	Deep	2026-04-28	1,796	--
Best OpenClaw Alternatives: Hermes Agent, ZeroClaw & NemoClaw	Deep	2026-04-28	2,193	--
How to Use OpenClaw with DeepInfra: Setup & Workflow Guide	Deep	2026-04-28	2,392	--
DeepInfra is now a supported Hugging Face Inference Provider	Aray Sultanbekova	2026-04-29	903	--
DeepSeek V4 Pro: Model Overview, Features & Performance Guide	Deep	2026-04-30	1,108	--
Kimi K2.6 is Now Available on DeepInfra	Deep	2026-04-30	1,477	--
DeepSeek V4 Pro (Max) API Benchmarks: Latency, Throughput & Cost Analysis	Deep	2026-04-30	2,101	--
Kimi K2.6 Model Overview: Architecture, Features & Capabilities	Deep	2026-04-30	1,323	--
Open vs Closed Source AI Models: Intelligence, Price & Speed Compared	Deep	2026-04-30	2,233	--
Kimi K2.6 API Benchmarks: Latency, TPS & Cost Analysis (2026)	Deep	2026-04-30	2,191	--
DeepSeek V4 Pro Is Now Available on DeepInfra	Deep	2026-04-30	1,530	--
Kimi K2.6 Pricing Guide 2026: Compare Costs & Deployment Strategies	Deep	2026-04-30	3,462	--
DeepSeek V4 Pro Pricing Guide 2026: Pricing, Providers & Cost Comparison	Deep	2026-04-30	3,759	--
We've Raised $107M to Build the Inference Cloud the AI Era Actually …	Yessen Kanapin	2026-05-04	952	--
Best API Providers for GLM-5.1 in 2026	Deep	2026-05-25	1,509	--
GLM-5.1 Model Overview: Features, Capabilities & Use Cases	Deep	2026-05-25	1,148	--
Best Kimi K2.6 API Providers for Developers (2026)	Deep	2026-05-25	1,165	--
GLM-5.1 on DeepInfra: Z.AI’s Agentic Engineering Model	Deep	2026-05-25	1,258	--
Gemma 4 on DeepInfra: Fast & Scalable Open AI Models	Deep	2026-05-25	1,488	--
GLM-5.1 API Benchmarks: Latency, Throughput & Cost	Deep	2026-05-25	2,142	--
NVIDIA Nemotron 3 Super on DeepInfra: 120B MoE Model	Deep	2026-05-25	1,486	--
Gemma 4 Model Overview: Features, Architecture & Use Cases	Deep	2026-05-25	1,258	--
Gemma 4 26B A4B API Benchmarks: Latency, Throughput & Cost	Deep	2026-05-25	1,660	--
Gemma 4 Pricing, Benchmarks & Real-World Cost Analysis	Deep	2026-05-25	2,955	--
Best SaaS Platforms for Deploying Gemma 4 in 2026	Deep	2026-05-25	1,467	--
Best API Providers for DeepSeek V4 in 2026	Deep	2026-05-25	1,179	--
Nemotron 3 Super Provider Pricing Comparison (2026)	Deep	2026-05-25	2,359	--
Best API Providers for NVIDIA Nemotron 3 Super 120B	Deep	2026-05-25	1,303	--
NVIDIA Nemotron 3 Super: Model Overview & Integration Guide	Deep	2026-05-25	1,160	--
GLM-5.1 Pricing Guide: API Cost Comparison & Analysis	Deep	2026-05-25	2,337	--
NVIDIA Nemotron 3 Super 120B API Benchmarks	Deep	2026-05-25	1,867	--
Open-Source vs Closed-Source AI Models: Is the Gap Worth It?	Deep	2026-05-26	3,331	--
OpenClaw Security: Prevent Prompt Injection & Supply Chain Attacks	Deep	2026-05-26	2,422	--
How Mixture of Experts Models Changed LLM Economics	Deep	2026-05-26	2,595	--
OpenClaw Use Cases That Deliver Real ROI	Deep	2026-05-26	2,380	--
OpenClaw Cost Optimization: Cut AI API Costs by 90%	Deep	2026-05-26	2,394	--
DeepInfra Launches Access to NVIDIA Cosmos 3 World Foundation Models for Physical …	Yessen Kanapin	2026-06-04	769	--
Nemotron 3 Ultra, 3.5 Content Safety and ASR models are now live …	Yessen Kanapin	2026-06-04	827	--
Step 3.7 Flash is Live on DeepInfra: An Agentic, Multimodal Model Built …	Deep	2026-06-12	910	--

Plushcap, by Matt Makai. 2021-2026.