RESEARCH · February 2026

RuleReasoner: Enhancing Logical Reasoning in Agents via Explicit Rule Guidance

The TongAgents team proposes RuleReasoner, a framework that significantly improves agent performance on complex logical reasoning tasks through explicit rule guidance

ICLR 2026 arXiv Code (GitHub) Dataset (Hugging Face)

In an era of rapid advancement in large language models (LLMs), ensuring reliable and interpretable performance on complex logical reasoning tasks remains a central challenge for both the research community and industry practitioners.

The TongAgents team at the Beijing Institute for General Artificial Intelligence (BIGAI) introduces an innovative reasoning framework — RuleReasoner — which significantly enhances the accuracy and interpretability of LLMs on logical reasoning tasks through an explicit rule-guidance mechanism.

Background

Logical Reasoning: A Critical Challenge for Large Language Models

Despite remarkable progress in natural language understanding and generation, LLMs frequently exhibit instability and unreliability on tasks demanding rigorous logical reasoning. While conventional prompt engineering approaches can partially improve reasoning capabilities, they lack a systematic rule-guidance mechanism.

The core innovation of RuleReasoner lies in explicitly integrating expert-crafted reasoning rules into the model's inference process, enabling the model to follow well-defined logical rules in a step-by-step manner — much as a human expert would.

Method

The RuleReasoner Framework

RuleReasoner employs a multi-stage reasoning framework comprising the following key components:

Rule Extraction and Formalization

Reasoning rules are extracted from domain expert knowledge and exemplars, then formalized into executable logical expressions to ensure both precision and applicability.

Rule-Guided Reasoning

During inference, the model not only leverages its inherent language understanding capabilities but also actively invokes relevant rules, ensuring that each reasoning step is grounded in explicit logical justification.

Dynamic Rule Selection

Based on the current reasoning state, the most relevant rules are dynamically selected for application, preventing rule redundancy and reasoning path divergence.

Enhanced Interpretability

Each reasoning step is accompanied by explicit rule citations, rendering the entire reasoning process transparent and traceable for human expert review and debugging.

图2 RuleReasoner algorithm: dynamic curriculum learning with domain-aware sampling

Experiments and Evaluation

Significant Improvements Across Multiple Benchmarks

We conducted comprehensive evaluations of RuleReasoner on several established logical reasoning benchmarks, spanning deductive reasoning, inductive reasoning, and commonsense reasoning tasks.

Dataset overview — 图3 Evaluation dataset overview: covering diverse reasoning task types

Benchmarking results — 图4 RuleReasoner performance across multiple benchmarks

Substantial Accuracy Gains

Across multiple logical reasoning benchmarks, RuleReasoner achieves an average accuracy improvement of 15–25% over baseline methods, with particularly pronounced gains on complex multi-step reasoning tasks.

More Interpretable Reasoning

Through explicit rule citations, the reasoning paths generated by RuleReasoner are significantly more transparent, enabling human experts to readily understand and verify the validity of each reasoning step.

Stronger Generalization

In cross-domain and zero-shot settings, RuleReasoner demonstrates superior generalization capabilities, validating the universality and robustness of the rule-guidance mechanism.

Experimental results — 图5 Detailed experimental results: comparative analysis across models and methods

图6 Reasoning dynamics: temporal analysis of rule application

Case Studies

Performance in Real-World Reasoning Scenarios

Through concrete case analyses, we provide an intuitive demonstration of RuleReasoner's advantages in practical reasoning tasks.

Case studies — 图7 Representative case studies: comparison of reasoning processes between RuleReasoner and baseline methods

Additional evaluations — 图8 Additional evaluation dimensions: robustness, efficiency, and scalability analysis

Outlook

Toward More Reliable Reasoning Intelligence

The introduction of RuleReasoner offers a novel paradigm for building more reliable and interpretable reasoning systems. Looking ahead, we will continue to explore the following directions:

🧠

Automatic Rule Learning

Enabling models to automatically learn and distill reasoning rules from data, reducing the need for manual intervention.

🔄

Multimodal Reasoning

Extending the rule-guidance mechanism to multimodal reasoning scenarios encompassing vision, speech, and beyond.

🤝

Human–AI Collaborative Reasoning

Building interactive frameworks for collaborative reasoning between human experts and AI systems, leveraging the complementary strengths of each.

We believe that by combining the rigor of symbolic reasoning with the flexibility of neural networks, RuleReasoner represents a significant step toward truly trustworthy artificial intelligence reasoning systems.

Logical ReasoningRule-based ReasoningLarge Language ModelsCurriculum Learning

Authors

Yang Liu^*1, Jiaqi Li^*1, Zilong Zheng¹

¹ BIGAI TongAgents

^* Core contributors.