Together AI Blog - Plushcap

Blog URL

www.together.ai/blog

Posts YTD

56 ↑ vs 39 last year

Avg Posts/Month

5.5 since 2025

Post Details

Title	Author	Published	Words	HN Points
Build ultra low latency voice AI applications with Together AI and Cartesia …	Together AI	2025-01-23	829	--
How to deploy DeepSeek-R1 and distilled models securely on Together AI	Together AI	2025-01-31	1,004	--
Mistral Small 3 API now available on Together AI: A new category …	Together AI	2025-01-30	712	--
Generate images with specific styles using Flux LoRAs on Together AI	Together AI	2025-01-27	891	--
Deploy DeepSeek-R1 at scale: Fast, secure serverless APIs and large-scale Together Reasoning …	Together AI	2025-02-12	984	--
Together AI Achieves 90% Faster BF16 Training with NVIDIA Blackwell Platform and …	Together AI	2025-02-13	1,422	--
Minions: embracing small LMs, shifting compute on-device, and cutting cloud costs in …	Avanika Narayan, Dan Biderman, Sabri Eyuboglu*, Avner May, Scott Linderman, James Zou, Christopher Ré	2025-02-25	1,257	--
Together AI Announces $305M Series B to Scale AI Acceleration Cloud for …	Together AI	2025-02-20	808	--
Together AI becomes NVIDIA Cloud Partner to bolster accelerated AI offerings	Together AI	2025-03-11	744	--
ThunderKittens Now Optimized for NVIDIA Blackwell GPUs	Benjamin Spector, Aaryan Singhal, Dan Fu, Chris Ré	2025-03-15	1,573	--
On-demand dedicated endpoints: run inference with unmatched price-performance & control at scale	Together AI	2025-03-13	1,191	--
Introducing Together Instant GPU Clusters Accelerated by NVIDIA GPUs, with Self-Service Provisioning …	Together AI	2025-03-18	800	--
Together AI Powers Pioneers at GTC: NVIDIA Blackwell GPUs, Instant GPU Clusters, …	Together AI	2025-03-18	1,836	--
Deploy Leading AI Models Accelerated by NVIDIA NIM on Together AI	Together AI	2025-03-18	744	--
Introducing Together Chat: use DeepSeek R1 for free, hosted in North America	Hassan El Mghari	2025-03-24	648	--
Together AI Awarded ClusterMAX™ Gold Rating by SemiAnalysis	Together AI	2025-03-27	973	--
Together AI partners with Meta to offer Llama 4: SOTA Multimodal MoE …	Together AI	2025-04-05	608	--
Scaling AI Companions: How Dippy AI Reached Over 4 Million Tokens/Minute with …	Together AI	2025-04-01	1,074	--
DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level	Michael Luo, Sijun Tan, Roy Huang, Ameen Patel, Alpay Ariyak, Qingyang Wu, Xiaoxiang Shi, Rachel Xin, Colin Cai, Maurice Weber, Ce Zhang, Li Erran Li, Raluca Ada Popa, Ion Stoica	2025-04-08	2,870	31
Together Fine-Tuning Platform, Now With Preference Optimization and Continued Training	Anirudh Jain, Ivan Provilkov, Artem Chumachenko, Alex Moldovan, George Grigorev, Gleb Vazhenin, Arsh Zahed, Avner May, Tristan Dubbeld, Max Ryabinin	2025-04-17	1,360	--
Direct Preference Optimization	Ivan Provilkov, Zain Hasan, Max Ryabinin	2025-04-17	1,472	37
Continued Fine-tuning of LLMs	Artem Chumachenko, Zain Hasan, Max Ryabinin	2025-04-17	1,292	--
Open Deep Research	Together AI	2025-04-16	3,100	--
Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas	Austin Silveria, Soham Govande, Dan Fu	2025-04-21	1,550	--
Salesforce, Zoom, InVideo Train Faster with Together AI Turbocharged with NVIDIA Blackwell	Together AI	2025-04-24	1,145	--
Together AI acquires Refuel.ai to unlock data for developers and businesses building …	Together AI	2025-05-15	792	--
Together Code Sandbox: the most robust infrastructure for building AI coding products …	Together AI	2025-05-20	920	1
Together Code Interpreter: execute LLM-generated code seamlessly with a simple API call	Together AI	2025-05-20	946	--
Introducing Together Code Sandbox & Together Code Interpreter: SOTA code execution for …	Together AI	2025-05-20	1,112	--
Boosting DeepSeek-R1’s Speed with Customized Speculative Decoding	Wai Tong Chung, Dan Waters, Avner May, Ben Athiwaratkun	2025-05-12	1,284	--
FLUX.1 Kontext models: Character consistency and precise image editing without fine-tuning	Together AI	2025-05-29	734	--
From AWS to Together Dedicated Endpoints: Arcee AI's journey to greater inference …	Together AI	2025-05-05	1,448	--
Mixture-of-Agents Alignment: Harnessing the Collective Intelligence of Open-Source LLMs to Improve Post-Training	Junlin Wang, Roy Xie, Shang Zhu, Jue Wang, Ben Athiwaratkun, Bhuwan Dhingra, Shuaiwen Leon Song, Ce Zhang, James Zou	2025-05-28	1,394	--
Model-Preserving Adaptive Rounding with YAQA	Albert Tseng, Zhaofeng Sun, and Chris De Sa	2025-06-05	2,091	--
How to Build a Coding Agent from Scratch: A Practical Guide for …	Zain Hasan	2025-06-12	1,060	--
Introducing the Together AI Batch API: Process Thousands of LLM Requests at …	Together AI	2025-06-11	637	--
The Frontier is Open	Charles Zedlewski	2025-06-09	1,351	1
Bringing 100,000 GPUs to Europe	Together AI	2025-06-12	731	--
Announcing the Together AI Startup Accelerator, purpose-built for AI Native Apps	Deveaux Barron, Prem Prakash	2025-10-15	671	--
Together AI delivers fastest inference for the top open-source models	Jue Wang, Wai Tong Chung, Chenxi Li, Chandra Mourya, John Heo, Shirley Wu, Alaskar Alizada, Rupert Wu, Roy Yuan, Pragaash Ponnusamy, Ben Athiwaratkun, Leon Song	2025-12-01	870	--
How to choose the right open model for production	Nicholas Broad, Dan Waters	2026-01-08	1,617	--
Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging Face Integrations	Artem Chumachenko, Maksim Abraham, Soroush Bassam, Gleb Vazhenin, Egor Timofeev, Conner Manuel, Zain Hasan, Will Van Eaton, Max Ryabinin	2025-09-10	1,410	--
Together Evaluations: Benchmark Models for Your Tasks	Ivan Provilkov, Ruslan Khaidurov, Kirah Sapong, George Grigorev, Gleb Vazhenin, Yogish Baliga, Zain Hasan, Max Ryabinin	2025-07-28	1,176	--
Transform OpenAI gpt-oss Models into Domain Experts with Together AI Fine-Tuning	Maksim Abraham, Conner Manuel, Eddie Hou, Will Van Eaton, Max Ryabinin	2025-08-19	635	--
Announcing General Availability of Together Instant Clusters, offering ready to use, self-service …	Nikitha Suryadevara, Clark Zinzow, Ryan Pollock	2025-09-09	1,061	--
Announcing native availability of NVIDIA Nemotron 3 Nano, NVIDIA's latest reasoning model	Together AI	2025-12-15	515	--
Expanding Together AI Model Library into multimedia generation with 40+ new image …	Justin Driemeyer, Necoline Hubner, Derek Petersen, Blaine Kasten, Rishabh Bhargava, Sonny Khan	2025-10-21	903	--
Powering Secure AI: Together AI Achieves SOC 2 Type 2 Compliance	Derek Chamorro, Head of Security, Together AI	2025-07-08	325	--
Rime voice models now available on Together AI	Arielle Fidel, Rajas Bansal, Sahil Yadav, Rishabh Bhargava, Sonny Khan	2025-12-18	1,231	--
Back to The Future: Evaluating AI Agents on Predicting Future Events	Federico Bianchi, Junlin Wang, Zain Hasan, Shang Zhu, Roy Yuan, Clémentine Fourrier, James Zou	2025-07-17	1,867	--
Inside multi-node training: How to scale model training across GPU clusters	Andrew Way, Gagan Gill	2026-01-12	979	--
FLUX.2: Multi-reference image generation now available on Together AI	Necoline Hubner, Sonny Khan, Rishabh Bhargava	2025-11-25	849	--
Announcing the fastest inference for realtime voice AI agents	Rajas Bansal, Sahil Yadav, Garima Dhanania, Sri Yanamandra, Charles Zedlewski, Zain Hasan, Derek Petersen, Blaine Kasten, Sonny Khan, Rishabh Bhargava	2025-11-04	1,148	--
DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL	Michael Luo, Naman Jain, Jaskirat Singh, Sijun Tan, Ameen Patel, Qingyang Wu, Alpay Ariyak, Colin Cai, Tarun Venkat, Shang Zhu, Ben Athiwaratkun, Manan Roongta, Ce Zhang, Li Erran Li, Raluca Ada Popa, Koushik Sen, Ion Stoica	2025-07-02	3,655	--
How Together AI Uses AI Agents to Automate Complex Engineering Tasks: Lessons …	Shang Zhu, Federico Bianchi, Wai Tong Chung, Zain Hasan, Rupert Wu, Ce Zhang, James Zou, Ben Athiwaratkun	2025-08-21	2,208	--
Together AI welcomes Mahadev Konar as SVP for Infrastructure Engineering	Vipul Ved Prakash	2025-09-10	479	--
How to Build a State-of-the-Art Search Stack for LLMs: RAG, Reranking, and …	Together AI	2026-01-13	725	--
Announcing Together Python SDK v2.0	Blaine Kasten, Zain Hasan	2025-12-12	1,369	--
Large Reasoning Models Fail to Follow Instructions During Reasoning: A Benchmark Study	Yongchan Kwon, Shang Zhu, Federico Bianchi, Kaitlyn Zhou, James Zou	2025-10-22	2,221	--
Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate …	Rajas Bansal, Mitali Meratwal, Nikitha Suryadevara, Will Van Eaton, Rishabh Bhargava	2025-09-15	374	--
How to evaluate and benchmark Large Language Models (LLMs)	Zain Hasan	2025-11-04	2,180	--
Dynamic AI agent testing for the real world with Collinear Simulations and …	Anand Kumar, Muyu He, Nazneen Rajani, Zain Hasan, Ivan Provilkov	2025-10-28	589	--
Together AI and Meta partner to bring PyTorch Reinforcement Learning to the …	Together AI Training and Research, The PyTorch team at Meta	2025-12-03	305	--
DeepSeek-V3.1: Hybrid Thinking Model Now Available on Together AI	Together AI	2025-08-27	650	--
Together AI Delivers Top Speeds for DeepSeek-R1-0528 Inference on NVIDIA Blackwell	Together AI	2025-07-17	1,527	--
Research POV: Yes, AGI Can Happen – A Computational Perspective	Together AI	2025-12-17	252	--
How to run TorchForge reinforcement learning pipelines in the Together AI Native …	Together AI Training and Research, The PyTorch team at Meta	2025-12-03	546	--
Introducing AutoJudge: Streamlined inference acceleration via automated dataset curation	Roman Garipov, Fedor Velikonivtsev, Ivan Ermakov, Ruslan Svirschevski, Vage Egiazarian, Max Ryabinin	2025-12-03	1,077	--
OpenAI's New Open gpt-oss Models vs o4-mini: A Real-World Comparison	Hassan El Mghari	2025-08-11	810	--
Together AI Launches Speech-to-Text: High-Performance Whisper APIs	Rajas Bansal, Rishabh Bhargava, Sonny Khan	2025-07-10	592	--
Qwen3-Coder: The Most Capable Agentic Coding Model Now Available on Together AI	Together AI	2025-07-25	643	--
Fine-Tuning Small Open-Source LLMs to Outperform Large Closed-Source Models by 60% on …	Charles O'Neill, Mudith Jayasekara, David Nugent, James Zou	2025-08-15	1,571	--
MiniMax Speech 2.6 Turbo now available natively on Together AI	Arielle Fidel, Rajas Bansal, Sahil Yadav, Rishabh Bhargava, Sonny Khan	2025-12-23	1,248	--
VirtueGuard: Enterprise-Grade AI Security and Safety Now on Together AI	Together AI	2025-07-29	608	--
From Zero to One: Building An Autonomous and Open Data Scientist Agent …	Federico Bianchi, Shang Zhu, Zain Hasan, Ben Athiwaratkun and James Zou	2025-06-12	3,316	--
Announcing the Availability of OpenAI's Open Models on Together AI	Vipul Ved Prakash	2025-08-05	1,010	--
Kimi K2: Leading Open-Source Model Now Available on Together AI	Together AI	2025-07-14	910	--
AdapTive-LeArning Speculator System (ATLAS): A New Paradigm in LLM Inference via Runtime-Learning …	Junxiong Wang, Shirley Wu, Zelei Shao, Vikranth Srivatsa, Jue Wang, Roy Yuan, Qingyang Wu, Alpay Ariyak, Rupert Wu, Wai Tong Chung, Chenfeng Xu, Yonatan Oren, Pragaash Ponnusamy, Yineng Zhang, Avner May, Leon Song, Tri Dao, Percy Liang, Ce Zhang, Ben Athi	2025-10-10	2,048	--
Learn how Cursor partnered with Together AI to deliver real-time, low-latency inference …	Dan Fu, Ingrid Xu, Ce Zhang, Cyrus Lalkaka, Sonny Khan	2026-01-13	683	--
Optimizing inference speed and costs: Lessons learned from large-scale deployments	David Nugent, Ingrid Xu	2026-01-22	1,234	--
DSGym: A holistic framework for evaluating and training data science agents	Fan Nie, Junlin Wang, Harper Hua, Federico Bianchi, Yongchan Kwon, Zhenting Qi, Owen Queen, Shang Zhu, James Zou	2026-01-26	1,270	--
Together Evaluations now supports comparing top commercial APIs vs. open source models	Ivan Provilkov, Conner Manuel, Kirah Sapong, Ruslan Khaidurov, Jasmine Li, Zain Hasan, Jennifer Wu, Max Ryabinin	2026-02-02	634	--
Fine-tuning open LLM judges to outperform GPT-5.2	Zain Hasan, Jasmine Li, Ivan Provilkov	2026-02-02	2,468	--
Together AI welcomes Alon Gavrielov as VP of Infrastructure Strategy	Vipul Ved Prakash	2026-02-03	476	--
Rime Arcana V3 Turbo and Rime Arcana V3 now available on Together …	Sahil Yadav, Arielle Fidel, Rajas Bansal, Rishabh Bhargava, Sonny Khan	2026-02-04	886	--
TogetherCoder-Preview: SOTA Open Dataset for Training Efficient Agents	Alpay Ariyak, Junda Zhang, Junxiong Wang, Shang Zhu, Federico Bianchi, Sanjana Srivastava, Ashwinee Panda, Siddhant Bharti, Chenfeng Xu, John Heo, Xiaoxia Shirley Wu, James Zou, Percy Liang, Leon Song, Ce Zhang, Ben Athiwaratkun, Zhongzhu Zhou, Qingyan	2026-02-05	3,143	--
What do LLMs think when you don't tell them what to think …	Yongchan Kwon and James Zou	2026-02-06	1,143	--
Cache-aware disaggregated inference for long-context LLM serving	Jiejing Zhang, Yubo Wang, Yinghui Liu, Mourya Vangala Srinivasa, Chenxi Li, Jue Wang, Yineng Zhang, Shuaiwen Leon Song, Ce Zhang	2026-02-11	1,975	--
Introducing Dedicated Container Inference: Delivering 2.6x faster inference for custom AI models	Sylvie Liberman, Rasul Nabiyev, Mohamad Rostami, Dulaj Disanayaka, Will Van Eaton, Nikitha Suryadevara	2026-02-12	952	--
Consistency diffusion language models: Up to 14x faster inference without sacrificing quality	Minseo Kim, Chenfeng Xu, Coleman Richard Charles Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami | Seoul National University, University of California, Berkeley, Together AI	2026-02-19	1,316	--
How speech models fail where it matters the most and what to …	Kaitlyn Zhou, Martijn Bartelds, Federico Bianchi, James Zou	2026-02-23	983	--
CoderForge-Preview: SOTA open dataset for training efficient coding agents	Alpay Ariyak, Junda Zhang, Junxiong Wang, Shang Zhu, Federico Bianchi, Sanjana Srivastava, Ashwinee Panda, Siddhant Bharti, Chenfeng Xu, John Heo, Xiaoxia Shirley Wu, James Zou, Percy Liang, Leon Song, Ce Zhang, Ben Athiwaratkun, Zhongzhu Zhou, Qingyang	2026-02-25	3,083	--
Key research and product announcements at the AI Native Conf	Together AI	2026-03-05	2,407	--
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling	Together AI	2026-03-05	3,416	--
Introducing Together AI’s new look	Together AI	2026-03-05	1,372	--
Best practices to accelerate inference for large-scale production workloads	Together AI	2026-03-05	4,850	--
Optimizing Training Workloads for GPU Clusters	Together AI	2026-03-05	1,805	--
New in Together GPU Clusters: Autoscaling, observability, and self-healing	Together AI	2026-03-11	1,799	--
Together AI Brings NVIDIA Nemotron 3 to Developers on Day 0	Together AI	2026-03-11	1,674	--
Build real-time voice agents on Together AI	Together AI	2026-03-13	1,796	--
Together AI at NVIDIA GTC 2026: Explore our latest innovations across research …	Together AI	2026-03-17	1,618	--
Mamba-3	Together AI	2026-03-18	4,544	--
Together AI expands fine-tuning service with tool calling, reasoning, and vision support	Together AI	2026-03-19	2,889	--
Divide, conquer, and plan: How weaker models beat GPT-4o on long context …	Together AI	2026-03-25	2,606	--
Plan, divide, and conquer: How weak models excel at long context tasks	Together AI	2026-03-27	2,607	--
Aurora	Together AI	2026-04-01	3,258	--
Inside the Together AI kernels team	Together AI	2026-04-01	3,484	--
AI for Systems: Using LLMs to Optimize Database Query Execution	Together AI	2026-04-03	3,542	--
Deepgram speech-to-text and voice models now available natively on Together AI	Together AI	2026-04-04	2,888	--
Wan 2.7 now available on Together AI	Together AI	2026-04-04	2,634	--
What is an AI Native Cloud?	Together AI	2026-04-08	3,096	--
EinsteinArena: Harnessing the collective intelligence of agents in the wild to advance …	Together AI	2026-04-13	4,123	--
Parcae: Doing more with fewer parameters using stable looped models	Together AI	2026-04-16	3,427	--
Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding	Together AI	2026-04-21	2,872	--
Capacity without conflict: A guide to multi-tenant GPU cluster design for AI-native …	Together AI	2026-04-22	3,108	--
Together AI Brings NVIDIA Nemotron 3 Nano Omni to Developers on Day …	Together AI	2026-04-29	2,510	--
Announcing Together AI and Adaption Partnership	Together AI	2026-04-30	2,210	--
DeepSeek-V4 Pro now available on Together AI	Together AI	2026-04-30	2,978	--
Benchmarking inference at scale: coding agents	Together AI	2026-04-30	2,862	--
From 732 bytes to nowhere: shutting down Copy Fail in production	Together AI	2026-05-01	2,599	--
Foundational research powering efficient inference at scale	Together AI	2026-05-05	3,356	--
Deploy and inference any model from HuggingFace	Together AI	2026-05-09	851	--
Serving DeepSeek-V4: why million-token context is an inference systems problem	Together AI	2026-05-09	2,573	--
Introducing voice finder — a new tool to quickly find the right …	Together AI	2026-05-13	394	--
Violin: An open-source video translation skill that breaks language barriers	Together AI	2026-05-15	909	--
Together AI and Pearl Research Labs Team Up to Reduce the Cost …	Together AI	2026-05-16	305	--
How Together AI built the world’s fastest speech-to-text stack	Together AI	2026-05-29	1,646	--
Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets	Together AI	2026-06-02	1,652	--
Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification	Together AI	2026-06-10	398	--
ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)	Together AI	2026-06-11	2,166	--
Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% …	Together AI	2026-06-17	1,063	--

Plushcap, by Matt Makai. 2021-2026.