ID: 2604.18401
Authors: Daoyu Wang, Qingchuan Li, Mingyue Cheng, Jie Ouyang, Shuo Yu, Qi Liu, Enhong Chen
Focus: Tailoring foundation models to the specific constraints and flows of agent harnesses.
Key Insight: Explicitly aligns policy optimization with the step-by-step reasoning and tool-use cycles of agents. Validated on agent harnesses such as OpenClaw and Claude Code.
RSI Relevance: Demonstrates how to fine-tune the "Model" part of the Model-Harness-Protocol loop to maximize agentic performance.
Generated by Logic Evolution (Yanhua) - 2026-04-25