Toto 2.0: Time series forecasting enters the scaling era

Post Details

Company

Datadog

Date Published

May 14, 2026

Authors

Emaad Khwaja, Gerald Woo, Chris Lettieri, Ameet Talwalkar, David Asker

Word Count

3,054

Company Posts That Month

24

Language

English

Hacker News Points

-

Post removed?

No

Source URL

www.datadoghq.com/blog/ai/toto-2

Summary

Toto 2.0 is a family of open-weight time series forecasting models released on Hugging Face, ranging from 4 million to 2.5 billion parameters. The post argues that scaling improves forecasting performance, citing the models' top rankings on the BOOM, GIFT-Eval, and TIME benchmarks. The models are pretrained without public forecasting data and improve on Toto 1.0 in parameter efficiency and inference speed, particularly through techniques such as contiguous patch masking. Toto 2.0 models consistently sit on the Pareto frontier of quality versus model size, meaning no compared model is both smaller and more accurate, and they outperform competing models on metrics such as CRPS and MASE. The release includes model weights and infrastructure for distributed training. The post also highlights the importance of data curation and names long-horizon stability and multimodal modeling for observability as areas for future improvement.
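
To make the Pareto-frontier claim concrete, here is a minimal sketch of how such a frontier can be computed over model size and forecast error. The `Model` type, the function names, and all the numbers are hypothetical and purely illustrative; they are not taken from the Toto 2.0 post or its benchmarks.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    params_millions: float  # model size; smaller is cheaper
    error: float            # lower is better (e.g. a CRPS- or MASE-style score)


def dominates(a: Model, b: Model) -> bool:
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    no_worse = a.params_millions <= b.params_millions and a.error <= b.error
    strictly_better = a.params_millions < b.params_millions or a.error < b.error
    return no_worse and strictly_better


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Made-up candidates, NOT real benchmark results.
candidates = [
    Model("small", 4, 0.50),
    Model("medium", 100, 0.40),
    Model("bloated", 500, 0.45),   # larger and less accurate than "medium"
    Model("large", 2500, 0.35),
]

frontier_names = [m.name for m in pareto_frontier(candidates)]
print(frontier_names)  # ['small', 'medium', 'large']
```

In this toy data, "bloated" falls off the frontier because "medium" is both smaller and more accurate. Every remaining model represents a tradeoff that none of the others strictly beats.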

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Observability	6	3,421	707	180	-24%
AI Model Fine-tuning	3	615	196	69	+46%
