OlmoLogic: Boosting Reasoning via RLVR with Inductive Logic Programming

Post Details

Company

HuggingFace

Date Published

June 26, 2026

Authors

Lukas Helff, Sebastian Sztwiertnia, Felix Friedrich, Hikaru Shindo, and Ahmad Omar

Word Count

2,702

Company Posts That Month

90

Source URL

huggingface.co/blog/LukasHug/olmo-logic

Summary

OlmoLogic is a model that adds Inductive Logic Programming (ILP) tasks to the Olmo-3 Reinforcement Learning with Verifiable Rewards (RLVR) pipeline, targeting logical reasoning, an area often neglected in favor of math and code. It was trained on 56 H100 GPUs over six days using tasks from the Scalable Logical Reasoning (SLR) suite, which comprises 19,000 tasks of varying complexity. OlmoLogic tripled accuracy on SLR-Bench and improved on other logic benchmarks while maintaining performance on math, code, and instruction-following tasks. During training, a Prolog interpreter executes the logic programs the model proposes, and the result is used directly as the RLVR reward; the reward structure favors rules that are both correct and simple. The authors also trained Olmo 3.1 7B Think, a variant without SLR tasks, as a comparison baseline to isolate the effect of SLR on logical reasoning. The approach brings logical reasoning into RLVR training without changes to the underlying training infrastructure.
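The reward described above (execute the proposed rule, score correctness, favor simpler rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the actual reward formula, so the linear length penalty, the names `Example`, `rule_covers`, and `reward`, and the set-membership check standing in for a real Prolog interpreter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class Example:
    """One labeled example: a set of ground facts plus a target label."""
    facts: FrozenSet[str]
    label: bool


def rule_covers(rule_body: Sequence[str], example: Example) -> bool:
    """Toy stand-in for a Prolog interpreter (assumption).

    The rule body is a conjunction of ground atoms; the rule covers an
    example when every atom appears among the example's facts.
    """
    return all(atom in example.facts for atom in rule_body)


def reward(rule_body: Sequence[str],
           examples: Sequence[Example],
           length_penalty: float = 0.05) -> float:
    """Hypothetical verifiable reward: accuracy minus a simplicity penalty.

    The penalty form and its weight are illustrative assumptions, not
    values taken from the post.
    """
    correct = sum(rule_covers(rule_body, ex) == ex.label for ex in examples)
    accuracy = correct / len(examples)
    # Each atom beyond the first costs `length_penalty`.
    penalty = length_penalty * max(0, len(rule_body) - 1)
    return max(0.0, accuracy - penalty)


# Four toy examples where the hidden concept is "the object is red".
examples = [
    Example(frozenset({"red", "short"}), True),
    Example(frozenset({"red", "long"}), True),
    Example(frozenset({"blue", "short"}), False),
    Example(frozenset({"blue", "long"}), False),
]

simple_rule_reward = reward(["red"], examples)
overly_specific_reward = reward(["red", "short"], examples)
```

In this sketch the single-atom rule classifies every example correctly and receives the full reward, while the longer rule both misclassifies one example and pays the length penalty, so it scores lower. A real setup would replace `rule_covers` with a call to an actual Prolog engine.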
