Together AI Blog - Plushcap

Blog URL

www.together.ai/blog

Posts YTD

56 ↑ vs 39 last year

Avg Posts/Month

4.5 since 2023

Monthly Post Volume

Post Details

Title	Author	Published	Words	HN Pts
Evo: Long-context modeling from molecular to genome scale	Eric Nguyen, Michael Poli, Matthew Durrant, Patrick Hsu, Brian Hie	2024-02-27	1,310	2
Can you feel the MoE? Mixtral available with over 100 tokens per …	Together	2023-12-11	323	--
Introducing the Together Embeddings endpoint — Higher accuracy, longer context, and lower …	Together AI	2024-01-11	745	1
Filter responses of any model with Llama Guard or your own safety …	Together	2023-12-10	356	--
Announcing OpenChatKit	Together	2023-03-10	2,765	--
TEAL: Training-Free Activation Sparsity in Large Language Models	James Liu, Pragaash Ponnusamy, Tianle Cai, Han Guo, Yoon Kim, Ben Athiwaratkun	2024-08-28	1,056	--
How Together and Crusoe are reducing the carbon impact of generative AI	Together	2023-04-20	737	--
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores	Dan Fu, Hermann Kumbong, Eric Nguyen, Chris Ré	2023-11-13	1,804	--
Introducing Together Rerank API and exclusive access to Salesforce LlamaRank model for …	Together AI	2024-08-26	1,582	1
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision	Jay Shah (Colfax Research), Ganesh Bikshandi (Colfax Research), Ying Zhang (Meta), Vijay Thakkar (NVIDIA), Pradeep Ramani (NVIDIA), Tri Dao (Princeton University, Together AI)	2024-07-11	1,753	287
Building your own RAG application using Together AI and Langchain	Together AI	2024-01-11	610	--
Building a personalized code assistant with open-source LLMs using RAG Fine-tuning	Kezhen Chen, Linda He, Ben Athiwaratkun, Jue Wang, Maurice Weber, Heejin Jeong, Yonatan Oren, Michael Poli	2024-06-24	1,333	--
Preparing for the era of 32K context: Early learnings and explorations	Together	2023-07-28	1,831	--
Long context retrieval models with Monarch Mixer	Jon Saad-Falcon, Dan Fu, Simran Arora	2024-01-11	2,583	--
Building your own RAG application using Together AI and LlamaIndex	Together AI	2024-01-11	615	--
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding	Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen	2024-03-12	616	--
Introducing Together AI Chief Scientist Tri Dao, as he releases FlashAttention-2 to …	Together	2023-07-17	2,001	--
Together AI partners with Meta to release Meta Llama 3 for inference …	Together AI	2024-04-18	602	--
RedPajama-INCITE-3B, an LLM for everyone	Together	2023-05-09	2,281	--
Announcing Together Custom Models. Build a state-of-the-art LLM with Together AI — …	Together	2023-11-13	1,289	--
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices	Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin	2024-06-18	1,308	--
The Mamba in the Llama: Distilling and Accelerating Hybrid Models	Junxiong Wang, Daniele Paliotta, Avner May, Alexander M. Rush, Tri Dao	2024-09-09	2,582	4
Together AI launches full stack for developers to build with open-source AI	Together	2023-07-14	645	--
Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API	Together	2023-08-18	1,092	--
Paving the way to efficient architectures: StripedHyena-7B, open source models offering a …	Together	2023-12-08	1,712	221
FlashConv: Speeding up state space models	Dan Fu and Tri Dao	2023-01-23	1,100	--
Together AI partners with Meta to release Llama 3.1 models for inference …	Together AI	2024-07-23	933	--
Flash Attention received the inaugural Stanford open source software award	Together AI	2024-05-22	445	--
Together AI and Snorkel AI empower enterprises to build proprietary LLMs	Together	2023-07-17	664	--
Announcing Together Inference Engine – the fastest inference available	Together AI	2023-11-13	880	2
FlashAttention: Fast and memory-efficient exact attention with IO-Awareness	Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré	2023-05-17	347	--
RedPajama-Data-v2: An open dataset with 30 trillion tokens for training large language …	Together	2023-10-30	2,223	1
Together AI welcomes Kai Mak as CRO to accelerate gen AI adoption …	Vipul Ved Prakash	2024-09-10	706	--
Together MoA — collective intelligence of open-source models pushing the frontier of …	Junlin Wang, Jue Wang, Ben Athiwaratkun, Ce Zhang, James Zou	2024-06-11	1,422	2
CocktailSGD: Fine-tuning foundation models over 500Mbps networks	Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Re, Ce Zhang	2023-04-24	234	--
RedPajama 7B now available, instruct model outperforms all open 7B models on …	Together	2023-06-06	1,595	--
Dragonfly: A large vision-language model with multi-resolution zoom	Kezhen Chen, Rahul Thapa, Rahul Chalamala, Ben Athiwaratkun, Shuaiwen Leon Song, James Zou	2024-06-06	1,061	143
Mamba-3B-SlimPJ: State-space models rivaling the best Transformer architecture	Tri Dao, Albert Gu	2023-12-12	550	--
Together AI and NVIDIA collaborate to power Llama 3.1 models for enterprises …	Together AI	2024-07-23	612	--
Our $102.5M Series A	Vipul Ved Prakash	2023-11-29	895	70
Announcing v1 of our Python SDK	Together AI	2024-04-22	361	--
Announcing $106M round led by Salesforce Ventures	Vipul Ved Prakash	2024-03-13	999	--
Growing to 20 exaflops, Together GPU Clusters help startups and enterprises accelerate …	Together	2023-11-13	929	--
Fine-tuning Llama-3 to get 90% of GPT-4’s performance at a fraction of …	Hassan El Mghari	2024-07-12	1,292	3
Supercharging NVIDIA H200 and H100 GPU Cluster Performance With Together Kernel Collection	Together AI	2024-09-05	1,781	--
Faster inference enables up to 5x price reduction on Together API	Together	2023-08-11	379	--
ThunderKittens: A Simple Embedded DSL for AI kernels	Benjamin Spector, Aaryan Singhal, Simran Arora, Chris Re	2024-05-12	659	--
Llama 3.1: Same model, different results. The impact of a percentage point.	Together AI	2024-07-31	5,632	--
A practitioner's guide to testing and running large GPU clusters for training …	Ryan Lucchese, Niki Birkner, Yaron Hagai, Virginia Adams	2024-08-13	2,068	80
Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & …	Together	2023-05-05	3,989	--
Speculative decoding for high-throughput long-context inference	Jian Chen, Vashisth Tiwari, Ranajoy Sadhukhan, Yunho Jin, Zhuoming Chen, Jinyuan Shi, Ian En-Hsu Yen, Avner May, Beidi Chen	2024-09-05	2,002	2
BitDelta: Your Fine-Tune May Only Be Worth One Bit	James Liu, Guangxuan Xiao, Kai Li, Jason D. Lee, Song Han, Tri Dao, Tianle Cai	2024-02-20	1,690	--
Hyena Hierarchy: Towards larger convolutional language models	Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré	2023-02-02	291	--
BASED: Simple linear attention language models balance the recall-throughput tradeoff	Simran, Sabri, Michael, Aman, Silas, Dylan, James, Atri, Chris	2024-03-04	2,303	165
Fine-tuning language models over slow networks using activation compression with guarantees	Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Re, Ce Zhang	2023-06-02	336	--
RedPajama training progress at 440 billion tokens	Together	2023-04-24	1,090	--
RedPajama, a project to create leading open-source models, starts by reproducing LLaMA …	Together	2023-04-17	1,032	--
FAQ: Building LLMs with RedPajama-v2, a 30 trillion token web dataset	Together AI	2024-05-01	2,248	--
FlexGen: High-throughput generative inference of large language models with a single GPU	Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang	2023-03-13	317	--
Flash-Decoding for long-context inference	Tri Dao, Daniel Haziza, Francisco Massa, Grigory Sizov	2023-10-12	1,271	--
Together AI partners with Snowflake to bring Arctic LLM to Enterprise customers	Together AI	2024-04-25	422	--
Decentralized training of foundation models in heterogeneous environments	Binhang Yuan, Yongjun He, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Re, Ce Zhang	2023-06-02	340	--
Building your own RAG application using Together AI and MongoDB Atlas	Together AI	2024-01-11	1,249	--
Announcing Together Inference Engine 2.0 with new Turbo and Lite endpoints	Together AI	2024-07-18	1,802	3
Using Axiomic to build multi agent chat with Together API	Together AI	2024-06-05	1,169	--
Monarch Mixer: A new model architecture for increased efficiency	Dan Fu, Simran Arora, Chris Ré	2023-07-25	1,981	--
Medusa: Simple framework for accelerating LLM generation with multiple decoding heads	Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Tri Dao (* Equal contribution)	2023-09-11	2,817	--
OpenChatKit now runs on consumer GPUs with a new 7B parameter model	Together	2023-03-30	2,310	--
Announcing function calling and JSON mode	Together AI	2024-01-31	1,861	--
Together’s $20M seed funding to build open-source AI and cloud platform	Vipul Ved Prakash	2023-05-15	602	--
Introducing The Together Enterprise Platform: Run GenAI securely in any environment, with …	Together AI	2024-09-23	1,356	--
Together AI launches Llama 3.2 APIs for vision, lightweight models & Llama …	Together AI	2024-09-25	1,482	1
FLUX API is now available on Together AI: New FLUX1.1 [pro] and …	Together AI	2024-10-03	694	1
Multimodal Document RAG with Llama 3.2 Vision and ColQwen2	Zain Hasan	2024-10-08	1,613	--
How to build a real-time image generator with Flux and Together AI	Hassan El Mghari	2024-10-11	1,197	--
Linearizing LLMs with LoLCATs	Michael Zhang, Simran Arora, Rahul Chalamala, Alan Wu, Benjamin Spector, Aaryan Singhal, Krithik Ramesh, Christopher Ré	2024-10-14	2,462	1
Even Better, Even Faster Quantized LLMs with QTIP	Albert Tseng, Qingyao Sun, David Hou, Chris De Sa	2024-10-30	3,170	--
Together AI to Co-Build Turbocharged NVIDIA GB200 Cluster with 36K Blackwell GPUs …	Together AI	2024-11-18	1,230	--
[COMING SOON] FLUX Tools now available via Together APIs: Get greater control …	Together AI	2024-11-21	216	--
Fine-Tuning LLMs for Multi-Turn Conversations: A Technical Deep Dive	Artem Chumachenko, Zain Hasan, Max Ryabinin	2024-11-25	2,206	3
Long Context Fine-Tuning: A Technical Deep Dive	George Grigorev, Zain Hasan, Max Ryabinin	2024-11-25	1,435	--
Fine-tuning API: Introducing long-context training, conversation data support and more configuration options	Max Ryabinin, Artem Chumachenko, George Grigorev, Arsh Zahed, Gleb Vazhenin	2024-11-25	1,726	--
AWS Marketplace now offering Together AI to accelerate enterprise AI development	Together AI	2024-12-02	415	--
Announcing Llama 3.3 70B, with enhanced reasoning, mathematics, and instruction-following on Together …	Together AI	2024-12-06	500	--
Together AI acquires CodeSandbox to launch first-of-its-kind code interpreter for generative AI	Together AI	2024-12-12	932	3
Announcing Serverless Multi-LoRA: Fine-tune and deploy hundreds of adapters for model customization …	Together AI	2024-12-18	1,224	--
Build ultra low latency voice AI applications with Together AI and Cartesia …	Together AI	2025-01-23	829	--
How to deploy DeepSeek-R1 and distilled models securely on Together AI	Together AI	2025-01-31	1,004	--
Mistral Small 3 API now available on Together AI: A new category …	Together AI	2025-01-30	712	--
Generate images with specific styles using Flux LoRAs on Together AI	Together AI	2025-01-27	891	--
Deploy DeepSeek-R1 at scale: Fast, secure serverless APIs and large-scale Together Reasoning …	Together AI	2025-02-12	984	--
Together AI Achieves 90% Faster BF16 Training with NVIDIA Blackwell Platform and …	Together AI	2025-02-13	1,422	--
How Zomato built an AI customer support bot that doubled customer satisfaction …	Together AI	2024-10-03	1,921	--
Minions: embracing small LMs, shifting compute on-device, and cutting cloud costs in …	Avanika Narayan, Dan Biderman, Sabri Eyuboglu*, Avner May, Scott Linderman, James Zou, Christopher Ré	2025-02-25	1,257	--
Together AI Announces $305M Series B to Scale AI Acceleration Cloud for …	Together AI	2025-02-20	808	--
Together AI becomes NVIDIA Cloud Partner to bolster accelerated AI offerings	Together AI	2025-03-11	744	--
ThunderKittens Now Optimized for NVIDIA Blackwell GPUs	Benjamin Spector, Aaryan Singhal, Dan Fu, Chris Ré	2025-03-15	1,573	--
On-demand dedicated endpoints: run inference with unmatched price-performance & control at scale	Together AI	2025-03-13	1,191	--
Introducing Together Instant GPU Clusters Accelerated by NVIDIA GPUs, with Self-Service Provisioning …	Together AI	2025-03-18	800	--
Together AI Powers Pioneers at GTC: NVIDIA Blackwell GPUs, Instant GPU Clusters, …	Together AI	2025-03-18	1,836	--
Deploy Leading AI Models Accelerated by NVIDIA NIM on Together AI	Together AI	2025-03-18	744	--
Introducing Together Chat: use DeepSeek R1 for free, hosted in North America	Hassan El Mghari	2025-03-24	648	--
Together AI Awarded ClusterMAX™ Gold Rating by SemiAnalysis	Together AI	2025-03-27	973	--
Together AI partners with Meta to offer Llama 4: SOTA Multimodal MoE …	Together AI	2025-04-05	608	--
Scaling AI Companions: How Dippy AI Reached Over 4 Million Tokens/Minute with …	Together AI	2025-04-01	1,074	--
DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level	Michael Luo, Sijun Tan, Roy Huang, Ameen Patel, Alpay Ariyak, Qingyang Wu, Xiaoxiang Shi, Rachel Xin, Colin Cai, Maurice Weber, Ce Zhang, Li Erran Li, Raluca Ada Popa, Ion Stoica	2025-04-08	2,870	31
Together Fine-Tuning Platform, Now With Preference Optimization and Continued Training	Anirudh Jain, Ivan Provilkov, Artem Chumachenko, Alex Moldovan, George Grigorev, Gleb Vazhenin, Arsh Zahed, Avner May, Tristan Dubbeld, Max Ryabinin	2025-04-17	1,360	--
Direct Preference Optimization	Ivan Provilkov, Zain Hasan, Max Ryabinin	2025-04-17	1,472	37
Continued Fine-tuning of LLMs	Artem Chumachenko, Zain Hasan, Max Ryabinin	2025-04-17	1,292	--
Open Deep Research	Together AI	2025-04-16	3,100	--
Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas	Austin Silveria, Soham Govande, Dan Fu	2025-04-21	1,550	--
Salesforce, Zoom, InVideo Train Faster with Together AI Turbocharged with NVIDIA Blackwell	Together AI	2025-04-24	1,145	--
Together AI acquires Refuel.ai to unlock data for developers and businesses building …	Together AI	2025-05-15	792	--
Together Code Sandbox: the most robust infrastructure for building AI coding products …	Together AI	2025-05-20	920	1
Together Code Interpreter: execute LLM-generated code seamlessly with a simple API call	Together AI	2025-05-20	946	--
Introducing Together Code Sandbox & Together Code Interpreter: SOTA code execution for …	Together AI	2025-05-20	1,112	--
Boosting DeepSeek-R1’s Speed with Customized Speculative Decoding	Wai Tong Chung, Dan Waters, Avner May, Ben Athiwaratkun	2025-05-12	1,284	--
FLUX.1 Kontext models: Character consistency and precise image editing without fine-tuning	Together AI	2025-05-29	734	--
From AWS to Together Dedicated Endpoints: Arcee AI's journey to greater inference …	Together AI	2025-05-05	1,448	--
Mixture-of-Agents Alignment: Harnessing the Collective Intelligence of Open-Source LLMs to Improve Post-Training	Junlin Wang, Roy Xie, Shang Zhu, Jue Wang, Ben Athiwaratkun, Bhuwan Dhingra, Shuaiwen Leon Song, Ce Zhang, James Zou	2025-05-28	1,394	--
Model-Preserving Adaptive Rounding with YAQA	Albert Tseng, Zhaofeng Sun, and Chris De Sa	2025-06-05	2,091	--
How to Build a Coding Agent from Scratch: A Practical Guide for …	Zain Hasan	2025-06-12	1,060	--
Introducing the Together AI Batch API: Process Thousands of LLM Requests at …	Together AI	2025-06-11	637	--
The Frontier is Open	Charles Zedlewski	2025-06-09	1,351	1
Bringing 100,000 GPUs to Europe	Together AI	2025-06-12	731	--
Announcing the Together AI Startup Accelerator, purpose-built for AI Native Apps	Deveaux Barron, Prem Prakash	2025-10-15	671	--
Together AI delivers fastest inference for the top open-source models	Jue Wang, Wai Tong Chung, Chenxi Li, Chandra Mourya, John Heo, Shirley Wu, Alaskar Alizada, Rupert Wu, Roy Yuan, Pragaash Ponnusamy, Ben Athiwaratkun, Leon Song	2025-12-01	870	--
How to choose the right open model for production	Nicholas Broad, Dan Waters	2026-01-08	1,617	--
Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging Face Integrations	Artem Chumachenko, Maksim Abraham, Soroush Bassam, Gleb Vazhenin, Egor Timofeev, Conner Manuel, Zain Hasan, Will Van Eaton, Max Ryabinin	2025-09-10	1,410	--
Together Evaluations: Benchmark Models for Your Tasks	Ivan Provilkov, Ruslan Khaidurov, Kirah Sapong, George Grigorev, Gleb Vazhenin, Yogish Baliga, Zain Hasan, Max Ryabinin	2025-07-28	1,176	--
Transform OpenAI gpt-oss Models into Domain Experts with Together AI Fine-Tuning	Maksim Abraham, Conner Manuel, Eddie Hou, Will Van Eaton, Max Ryabinin	2025-08-19	635	--
Announcing General Availability of Together Instant Clusters, offering ready to use, self-service …	Nikitha Suryadevara, Clark Zinzow, Ryan Pollock	2025-09-09	1,061	--
Announcing native availability of NVIDIA Nemotron 3 Nano, NVIDIA's latest reasoning model	Together AI	2025-12-15	515	--
Expanding Together AI Model Library into multimedia generation with 40+ new image …	Justin Driemeyer, Necoline Hubner, Derek Petersen, Blaine Kasten, Rishabh Bhargava, Sonny Khan	2025-10-21	903	--
Powering Secure AI: Together AI Achieves SOC 2 Type 2 Compliance	Derek Chamorro, Head of Security, Together AI	2025-07-08	325	--
Rime voice models now available on Together AI	Arielle Fidel, Rajas Bansal, Sahil Yadav, Rishabh Bhargava, Sonny Khan	2025-12-18	1,231	--
Back to The Future: Evaluating AI Agents on Predicting Future Events	Federico Bianchi, Junlin Wang, Zain Hasan, Shang Zhu, Roy Yuan, Clémentine Fourrier, James Zou	2025-07-17	1,867	--
Inside multi-node training: How to scale model training across GPU clusters	Andrew Way, Gagan Gill	2026-01-12	979	--
FLUX.2: Multi-reference image generation now available on Together AI	Necoline Hubner, Sonny Khan, Rishabh Bhargava	2025-11-25	849	--
Announcing the fastest inference for realtime voice AI agents	Rajas Bansal, Sahil Yadav, Garima Dhanania, Sri Yanamandra, Charles Zedlewski, Zain Hasan, Derek Petersen, Blaine Kasten, Sonny Khan, Rishabh Bhargava	2025-11-04	1,148	--
DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL	Michael Luo, Naman Jain, Jaskirat Singh, Sijun Tan, Ameen Patel, Qingyang Wu, Alpay Ariyak, Colin Cai, Tarun Venkat, Shang Zhu, Ben Athiwaratkun, Manan Roongta, Ce Zhang, Li Erran Li, Raluca Ada Popa, Koushik Sen, Ion Stoica	2025-07-02	3,655	--
How Together AI Uses AI Agents to Automate Complex Engineering Tasks: Lessons …	Shang Zhu, Federico Bianchi, Wai Tong Chung, Zain Hasan, Rupert Wu, Ce Zhang, James Zou, Ben Athiwaratkun	2025-08-21	2,208	--
Together AI welcomes Mahadev Konar as SVP for Infrastructure Engineering	Vipul Ved Prakash	2025-09-10	479	--
How to Build a State-of-the-Art Search Stack for LLMs: RAG, Reranking, and …	Together AI	2026-01-13	725	--
Announcing Together Python SDK v2.0	Blaine Kasten, Zain Hasan	2025-12-12	1,369	--
Large Reasoning Models Fail to Follow Instructions During Reasoning: A Benchmark Study	Yongchan Kwon, Shang Zhu, Federico Bianchi, Kaitlyn Zhou, James Zou	2025-10-22	2,221	--
Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate …	Rajas Bansal, Mitali Meratwal, Nikitha Suryadevara, Will Van Eaton, Rishabh Bhargava	2025-09-15	374	--
How to evaluate and benchmark Large Language Models (LLMs)	Zain Hasan	2025-11-04	2,180	--
Dynamic AI agent testing for the real world with Collinear Simulations and …	Anand Kumar, Muyu He, Nazneen Rajani, Zain Hasan, Ivan Provilkov	2025-10-28	589	--
Together AI and Meta partner to bring PyTorch Reinforcement Learning to the …	Together AI Training and Research, The PyTorch team at Meta	2025-12-03	305	--
DeepSeek-V3.1: Hybrid Thinking Model Now Available on Together AI	Together AI	2025-08-27	650	--
Together AI Delivers Top Speeds for DeepSeek-R1-0528 Inference on NVIDIA Blackwell	Together AI	2025-07-17	1,527	--
Research POV: Yes, AGI Can Happen – A Computational Perspective	Together AI	2025-12-17	252	--
How to run TorchForge reinforcement learning pipelines in the Together AI Native …	Together AI Training and Research, The PyTorch team at Meta	2025-12-03	546	--
Introducing AutoJudge: Streamlined inference acceleration via automated dataset curation	Roman Garipov, Fedor Velikonivtsev, Ivan Ermakov, Ruslan Svirschevski, Vage Egiazarian, Max Ryabinin	2025-12-03	1,077	--
OpenAI's New Open gpt-oss Models vs o4-mini: A Real-World Comparison	Hassan El Mghari	2025-08-11	810	--
Together AI Launches Speech-to-Text: High-Performance Whisper APIs	Rajas Bansal, Rishabh Bhargava, Sonny Khan	2025-07-10	592	--
Qwen3-Coder: The Most Capable Agentic Coding Model Now Available on Together AI	Together AI	2025-07-25	643	--
Fine-Tuning Small Open-Source LLMs to Outperform Large Closed-Source Models by 60% on …	Charles O'Neill, Mudith Jayasekara, David Nugent, James Zou	2025-08-15	1,571	--
MiniMax Speech 2.6 Turbo now available natively on Together AI	Arielle Fidel, Rajas Bansal, Sahil Yadav, Rishabh Bhargava, Sonny Khan	2025-12-23	1,248	--
VirtueGuard: Enterprise-Grade AI Security and Safety Now on Together AI	Together AI	2025-07-29	608	--
From Zero to One: Building An Autonomous and Open Data Scientist Agent …	Federico Bianchi, Shang Zhu, Zain Hasan, Ben Athiwaratkun and James Zou	2025-06-12	3,316	--
Announcing the Availability of OpenAI's Open Models on Together AI	Vipul Ved Prakash	2025-08-05	1,010	--
Kimi K2: Leading Open-Source Model Now Available on Together AI	Together AI	2025-07-14	910	--
AdapTive-LeArning Speculator System (ATLAS): A New Paradigm in LLM Inference via Runtime-Learning …	Junxiong Wang, Shirley Wu, Zelei Shao, Vikranth Srivatsa, Jue Wang, Roy Yuan, Qingyang Wu, Alpay Ariyak, Rupert Wu, Wai Tong Chung, Chenfeng Xu, Yonatan Oren, Pragaash Ponnusamy, Yineng Zhang, Avner May, Leon Song, Tri Dao, Percy Liang, Ce Zhang, Ben Athi	2025-10-10	2,048	--
Learn how Cursor partnered with Together AI to deliver real-time, low-latency inference …	Dan Fu, Ingrid Xu, Ce Zhang, Cyrus Lalkaka, Sonny Khan	2026-01-13	683	--
Optimizing inference speed and costs: Lessons learned from large-scale deployments	David Nugent, Ingrid Xu	2026-01-22	1,234	--
DSGym: A holistic framework for evaluating and training data science agents	Fan Nie, Junlin Wang, Harper Hua, Federico Bianchi, Yongchan Kwon, Zhenting Qi, Owen Queen, Shang Zhu, James Zou	2026-01-26	1,270	--
Together Evaluations now supports comparing top commercial APIs vs. open source models	Ivan Provilkov, Conner Manuel, Kirah Sapong, Ruslan Khaidurov, Jasmine Li, Zain Hasan, Jennifer Wu, Max Ryabinin	2026-02-02	634	--
Fine-tuning open LLM judges to outperform GPT-5.2	Zain Hasan, Jasmine Li, Ivan Provilkov	2026-02-02	2,468	--
Together AI welcomes Alon Gavrielov as VP of Infrastructure Strategy	Vipul Ved Prakash	2026-02-03	476	--
Rime Arcana V3 Turbo and Rime Arcana V3 now available on Together …	Sahil Yadav, Arielle Fidel, Rajas Bansal, Rishabh Bhargava, Sonny Khan	2026-02-04	886	--
TogetherCoder-Preview: SOTA Open Dataset for Training Efficient Agents	Alpay Ariyak, Junda Zhang, Junxiong Wang, Shang Zhu, Federico Bianchi, Sanjana Srivastava, Ashwinee Panda, Siddhant Bharti, Chenfeng Xu, John Heo, Xiaoxia Shirley Wu, James Zhou, Percy Liang, Leon Song, Ce Zhang, Ben Athiwaratkun, Zhongzhu Zhou, Qingyan	2026-02-05	3,143	--
What do LLMs think when you don't tell them what to think …	Yongchan Kwon and James Zou	2026-02-06	1,143	--
Cache-aware disaggregated inference for long-context LLM serving	Jiejing Zhang, Yubo Wang, Yinghui Liu, Mourya Vangala Srinivasa, Chenxi Li, Jue Wang, Yineng Zhang, Shuaiwen Leon Song, Ce Zhang	2026-02-11	1,975	--
Introducing Dedicated Container Inference: Delivering 2.6x faster inference for custom AI models	Sylvie Liberman, Rasul Nabiyev, Mohamad Rostami, Dulaj Disanayaka, Will Van Eaton, Nikitha Suryadevara	2026-02-12	952	--
Consistency diffusion language models: Up to 14x faster inference without sacrificing quality	Minseo Kim, Chenfeng Xu, Coleman Richard Charles Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami | Seoul National University, University of California, Berkeley, Together AI	2026-02-19	1,316	--
How speech models fail where it matters the most and what to …	Kaitlyn Zhou, Martijn Bartelds, Federico Bianchi, James Zou	2026-02-23	983	--
CoderForge-Preview: SOTA open dataset for training efficient coding agents	Alpay Ariyak, Junda Zhang, Junxiong Wang, Shang Zhu, Federico Bianchi, Sanjana Srivastava, Ashwinee Panda, Siddhant Bharti, Chenfeng Xu, John Heo, Xiaoxia Shirley Wu, James Zou, Percy Liang, Leon Song, Ce Zhang, Ben Athiwaratkun, Zhongzhu Zhou, Qingyang	2026-02-25	3,083	--
Key research and product announcements at the AI Native Conf	Together AI	2026-03-05	2,407	--
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling	Together AI	2026-03-05	3,416	--
Introducing Together AI’s new look	Together AI	2026-03-05	1,372	--
Best practices to accelerate inference for large-scale production workloads	Together AI	2026-03-05	4,850	--
Optimizing Training Workloads for GPU Clusters	Together AI	2026-03-05	1,805	--
New in Together GPU Clusters: Autoscaling, observability, and self-healing	Together AI	2026-03-11	1,799	--
Together AI Brings NVIDIA Nemotron 3 to Developers on Day 0	Together AI	2026-03-11	1,674	--
Build real-time voice agents on Together AI	Together AI	2026-03-13	1,796	--
Together AI at NVIDIA GTC 2026: Explore our latest innovations across research …	Together AI	2026-03-17	1,618	--
Mamba-3	Together AI	2026-03-18	4,544	--
Together AI expands fine-tuning service with tool calling, reasoning, and vision support	Together AI	2026-03-19	2,889	--
Divide, conquer, and plan: How weaker models beat GPT-4o on long context …	Together AI	2026-03-25	2,606	--
Plan, divide, and conquer: How weak models excel at long context tasks	Together AI	2026-03-27	2,607	--
Aurora	Together AI	2026-04-01	3,258	--
Inside the Together AI kernels team	Together AI	2026-04-01	3,484	--
AI for Systems: Using LLMs to Optimize Database Query Execution	Together AI	2026-04-03	3,542	--
Deepgram speech-to-text and voice models now available natively on Together AI	Together AI	2026-04-04	2,888	--
Wan 2.7 now available on Together AI	Together AI	2026-04-04	2,634	--
What is an AI Native Cloud?	Together AI	2026-04-08	3,096	--
EinsteinArena: Harnessing the collective intelligence of agents in the wild to advance …	Together AI	2026-04-13	4,123	--
Parcae: Doing more with fewer parameters using stable looped models	Together AI	2026-04-16	3,427	--
Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding	Together AI	2026-04-21	2,872	--
Capacity without conflict: A guide to multi-tenant GPU cluster design for AI-native …	Together AI	2026-04-22	3,108	--
Together AI Brings NVIDIA Nemotron 3 Nano Omni to Developers on Day …	Together AI	2026-04-29	2,510	--
Announcing Together AI and Adaption Partnership	Together AI	2026-04-30	2,210	--
DeepSeek-V4 Pro now available on Together AI	Together AI	2026-04-30	2,978	--
Benchmarking inference at scale: coding agents	Together AI	2026-04-30	2,862	--
From 732 bytes to nowhere: shutting down Copy Fail in production	Together AI	2026-05-01	2,599	--
Foundational research powering efficient inference at scale	Together AI	2026-05-05	3,356	--
Deploy and inference any model from HuggingFace	Together AI	2026-05-09	851	--
Serving DeepSeek-V4: why million-token context is an inference systems problem	Together AI	2026-05-09	2,573	--
Introducing voice finder — a new tool to quickly find the right …	Together AI	2026-05-13	394	--
Violin: An open-source video translation skill that breaks language barriers	Together AI	2026-05-15	909	--
Together AI and Pearl Research Labs Team Up to Reduce the Cost …	Together AI	2026-05-16	305	--
How Together AI built the world’s fastest speech-to-text stack	Together AI	2026-05-29	1,646	--
Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets	Together AI	2026-06-02	1,652	--
Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification	Together AI	2026-06-10	398	--
ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)	Together AI	2026-06-11	2,166	--
Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% …	Together AI	2026-06-17	1,063	--

Plushcap, by Matt Makai. 2021-2026.