
Getting More from Your Test-Time Compute Budget with Portfolio Beam Search

Blog post from HuggingFace

Post Details

Company: HuggingFace
Date Published: -
Author: Dan Elbaz, Oren Salzman, Oren Pereg, Daniel Korat, and Ronen Laperdon
Word Count: 3,527
Language: -
Hacker News Points: -
Summary

Portfolio Beam Search (PBS) is a test-time decoding method that applies financial portfolio theory to large language model (LLM) inference, spending a fixed compute budget by diversifying candidate solutions the way an investor diversifies assets. Whereas traditional beam search greedily keeps the highest-scoring paths, PBS evaluates candidates by their risk-adjusted potential, which helps it avoid reasoning ruts and improves both accuracy and reasoning robustness. The method reflects a broader shift in AI scaling from reliance on pretraining toward test-time compute scaling, in which inference-time computation is deliberately increased to tackle complex tasks. By framing decoding as an optimization problem, PBS balances expected output quality against model uncertainty and semantic diversity, trading off exploration and exploitation. On the MATH-500 benchmark, PBS significantly improves sample efficiency and compute-budget utilization, allowing smaller models to reach accuracy comparable to much larger architectures. This opens new possibilities for scaling test-time compute across domains, and ongoing research is exploring the limits of the approach.
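The core idea of balancing expected quality against uncertainty and redundancy can be illustrated with a small sketch. This is not the authors' implementation: the candidate fields (`mean`, `var`, `emb`), the greedy selection loop, and the `risk_aversion`/`diversity_weight` parameters are all illustrative assumptions, loosely modeled on mean-variance portfolio selection.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def utility(cand, selected, risk_aversion, diversity_weight):
    """Hypothetical risk-adjusted score: expected reward, penalized by
    uncertainty (variance) and by similarity to already-selected picks."""
    redundancy = max(
        (cosine(cand["emb"], s["emb"]) for s in selected), default=0.0
    )
    return cand["mean"] - risk_aversion * cand["var"] - diversity_weight * redundancy

def portfolio_select(candidates, k, risk_aversion=0.5, diversity_weight=0.5):
    """Greedily build a 'portfolio' of k candidates: at each step, take the
    candidate with the best utility given what is already in the portfolio."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        best = max(
            pool,
            key=lambda c: utility(c, selected, risk_aversion, diversity_weight),
        )
        selected.append(best)
        pool.remove(best)
    return selected

# Three candidate solutions: "a" and "b" are near-duplicates, "c" is diverse.
cands = [
    {"name": "a", "mean": 0.90, "var": 0.0, "emb": [1.0, 0.0]},
    {"name": "b", "mean": 0.85, "var": 0.0, "emb": [1.0, 0.0]},
    {"name": "c", "mean": 0.60, "var": 0.0, "emb": [0.0, 1.0]},
]
picks = [c["name"] for c in portfolio_select(cands, k=2)]
print(picks)  # a purely greedy top-2 would keep the redundant ["a", "b"]
```

In this toy run the diversity penalty makes the second pick "c" rather than the higher-scoring but redundant "b", which is the exploration-vs-exploitation trade-off the summary describes.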