Building a Real-Time Shopping Assistant: Turn Live Video into Instant Purchases | Michael Louis | Aug 14, 2024 | 2435 | -
Using Codestral to Summarize, Correct and Auto-Approve Pull Requests | Cerebrium Team | Jun 15, 2024 | 2400 | -
Creating a realtime RAG voice agent | Cerebrium Team | Jul 21, 2024 | 3262 | -
Introduction | Cerebrium Team | Apr 09, 2024 | 1163 | 1
Installing Python Packages via UV leads to 3.75x increase in build performance | - | Feb 15, 2024 | 28 | -
Getting better price-performance, latency, and availability on AWS Trn1/Inf2 instances | Cerebrium Team | May 20, 2024 | 1950 | -
Creating an Executive Assistant using LangChain, LangSmith, Cerebrium and Cal.com | Michael Louis | May 19, 2024 | 2482 | -
Running Llama 3 8B with TensorRT-LLM on Serverless GPUs | Michael Louis | May 16, 2024 | 1410 | -
How to Build a Real-Time AI Avatar for Training and Coaching | Michael Louis | Sep 17, 2024 | 2529 | -
Cerebrium supports HIPAA compliance: A guide for health applications | Kyle Gani | Sep 30, 2024 | 1208 | -
Benchmarking vLLM, SGLang and TensorRT for Llama 3.1 API | Michael Louis | Oct 10, 2024 | 643 | -
An Alternative to OpenAI Realtime API for Voice Capabilities | Michael Louis | Oct 14, 2024 | 1359 | 7
ML apps at scale: ASGI support now available on Cerebrium | Kyle Gani | Oct 28, 2024 | 452 | -
Overcoming Transcription Challenges for Multilingual AI voice agents | Michael Louis | Dec 19, 2024 | 1275 | -
Building a Real-time Coding Assistant | Kyle Gani | Feb 20, 2025 | 3114 | -
Creating a realtime AI Commentator with Cerebrium, LiveKit and Cartesia | Michael Louis | Feb 18, 2025 | 4243 | -
Deploying Ultravox on Cerebrium for Ultra-low Latency Voice Applications | Kyle Gani | Apr 28, 2025 | 1194 | -
Orpheus TTS: How to Deploy Orpheus at Scale for Production Inference | Cerebrium Team | Aug 29, 2025 | 1756 | -
How much does a H100 cost? Cost comparison | Cerebrium Team | Aug 29, 2025 | 1026 | -
How to Deploy Machine Learning Models: A comprehensive Guide | Cerebrium Team | Aug 29, 2025 | 997 | -
5 Top Free Hosting Platforms for Python Apps | Cerebrium Team | Aug 29, 2025 | 1773 | -
Top 5 Serverless GPU providers | Cerebrium Team | Aug 29, 2025 | 1055 | -
Deploying a global scale, AI voice agent with 500ms latency. | Cerebrium Team | Jun 25, 2025 | 1765 | -
Integrating PayPal's Model Context Protocol (MCP) into a Real-time Voice Agent | Cerebrium Team | Jul 31, 2025 | 2134 | -
Alternatives to AWS, GCP and Azure for deploying AI models efficiently | Cerebrium Team | May 26, 2025 | 1137 | -
Launch Week Day 3: Announcing Multi-Region Deployments | Cerebrium Team | Jul 10, 2025 | 583 | -
Introducing Cerebrium run: The Fastest Way to Execute Cloud Code | Cerebrium Team | Jul 09, 2025 | 718 | -
How much does a H200 cost? 2025 Guide | Cerebrium Team | Aug 29, 2025 | 906 | -
Cerebrium Raises $8.5M led by Gradient to Scale the Leading High-Performance Serverless AI Platform | Cerebrium Team | Jul 08, 2025 | 532 | -
How Startups Can Cut AI Infrastructure Costs Without Compromising Performance | Cerebrium Team | May 26, 2025 | 462 | -
Faster Whisper Transcription: How to Maximize Performance for Real-Time Audio-to-Text | Cerebrium Team | Aug 29, 2025 | 1025 | -
Deploying Sesame CSM: The Most Realistic Voice Model as an API | Cerebrium Team | Aug 29, 2025 | 2253 | -
Deploying DeepSeek-R1: A Guide to a Serverless, High-Performing OpenAI-Compatible Endpoint | Cerebrium Team | Aug 29, 2025 | 1229 | -