18 blog posts published by month since the start of 2025.

Posts year-to-date: 18 (33 posts by this month last year)
Average posts per month since 2025: 1.5

Post details (2025 to today)

| Title | Author | Date | Word count | HN points |
| --- | --- | --- | --- | --- |
| Driving model performance optimization: 2024 highlights | Pankaj Gupta | Jan 14, 2025 | 1530 | - |
| Private, secure DeepSeek-R1 in production in US & EU data centers | Amir Haghighat, Philip Kiely | Feb 11, 2025 | 1274 | - |
| Testing Llama 3.3 70B inference performance on NVIDIA GH200 in Lambda Cloud | Pankaj Gupta, Philip Kiely | Feb 11, 2025 | 1033 | - |
| Baseten Chains is now GA for production compound AI systems | Marius Killinger, Tyron Jung, Rachel Rapp | Feb 12, 2025 | 1123 | - |
| How multi-node inference works for massive LLMs like DeepSeek-R1 | Phil Howes, Philip Kiely | Feb 15, 2025 | 1303 | - |
| Announcing Baseten’s $75M Series C | Tuhin Srivastava | Feb 26, 2025 | 739 | - |
| How we built high-throughput embedding, reranker, and classifier inference with TensorRT-LLM | Michael Feil, Philip Kiely | Mar 28, 2025 | 2035 | - |
| Introducing Baseten Embeddings Inference: The fastest embeddings solution available | Michael Feil, Rachel Rapp | Mar 28, 2025 | 782 | - |
| The best open-source embedding models | Philip Kiely | Apr 07, 2025 | 1254 | - |
| Building performant embedding workflows with Chroma and Baseten | Philip Kiely | Apr 11, 2025 | 570 | - |
| Accelerating inference with NVIDIA B200 GPUs | Philip Kiely | Apr 23, 2025 | 857 | - |
| Canopy Labs selects Baseten as preferred inference provider for Orpheus TTS models | Philip Kiely | May 07, 2025 | 1350 | - |
| Introducing Model APIs and Training | - | May 24, 2025 | 525 | - |
| Introducing our new brand | - | May 25, 2025 | 258 | - |
| Day zero benchmarks for Qwen 3 with SGLang on Baseten | Yineng Zhang | May 19, 2025 | 1303 | - |
| How Baseten multi-cloud capacity management (MCM) unifies deployments | Rachel Rapp | Jun 10, 2025 | 935 | - |
| Forward deployed engineering on the frontier of AI | Vlad Shulman | Jun 11, 2025 | 2108 | - |
| Your client code matters: 12x higher embedding throughput with Python and Rust | Michael Feil | Jun 13, 2025 | 1280 | - |
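The summary figures above can be reproduced from the publication dates in the table. Below is a minimal Python sketch, assuming the year-to-date count is a simple count of 2025 posts and the monthly chart is a group-by on calendar month; the denominator behind the 1.5 average is not stated on the page, so the averaging lines are only an illustration.

```python
from collections import Counter
from datetime import date

# Publication dates copied from the table above (2025 posts only).
post_dates = [
    date(2025, 1, 14), date(2025, 2, 11), date(2025, 2, 11), date(2025, 2, 12),
    date(2025, 2, 15), date(2025, 2, 26), date(2025, 3, 28), date(2025, 3, 28),
    date(2025, 4, 7),  date(2025, 4, 11), date(2025, 4, 23), date(2025, 5, 7),
    date(2025, 5, 19), date(2025, 5, 24), date(2025, 5, 25), date(2025, 6, 10),
    date(2025, 6, 11), date(2025, 6, 13),
]

# Posts year-to-date: a straight count of the rows.
ytd = len(post_dates)  # 18

# Posts by month, as plotted in the chart at the top of the page.
by_month = Counter(d.strftime("%Y-%m") for d in post_dates)

# Average posts per month. Dividing by the number of elapsed calendar months
# gives 3.0 with this data; the 1.5 shown above matches a 12-month denominator.
# Which denominator the page actually uses is an assumption, not confirmed.
elapsed_months = max(post_dates).month

print(f"Posts year-to-date: {ytd}")
print("Posts by month:", dict(sorted(by_month.items())))
print(f"Average per elapsed month: {ytd / elapsed_months:.1f}")
print(f"Average per month over 12 months: {ytd / 12:.1f}")
```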