Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

ArXiv: 2602.21320 | Feb 2026 | RSI | Tool-Use

Abstract: This work provides empirical insights into self-play LLM agents by analyzing co-evolution, curriculum dynamics, and scaling behavior. Through a self-evolving loop, it surpasses fully supervised tool-calling baselines under the same setting.

Key Insight: A Generator-Solver self-play framework bootstraps complex tool-calling capabilities without external expert demonstrations. The "Zero Data" result suggests that environment feedback alone can be sufficient for capability emergence if the loop is structured correctly.
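
The Generator-Solver loop can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the paper's implementation: the tools (`add`, `mul`), the scalar `skill` and `difficulty` values, the success-probability model, and the update rules are all hypothetical stand-ins for LLM policies trained with reinforcement learning. What it shows is the structure of the loop: the Generator proposes tasks, the Solver emits a tool call, the environment verifies the call by executing it, and both sides adapt using only that feedback.

```python
import random
from dataclasses import dataclass

# Hypothetical toy tools; stand-ins for whatever tool set a real agent would use.
TOOLS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


@dataclass(frozen=True)
class Task:
    tool: str        # tool the Generator intends the Solver to call
    args: tuple      # arguments for that tool
    difficulty: int  # assumed scalar difficulty, 1 = easiest


def generate_task(rng, difficulty):
    """Generator: propose a tool-calling task at the current difficulty."""
    tool = rng.choice(sorted(TOOLS))
    hi = 10 ** difficulty
    # Arguments start at 3 so that add and mul never return the same value.
    return Task(tool, (rng.randint(3, hi), rng.randint(3, hi)), difficulty)


def solve(rng, task, skill):
    """Solver stand-in: emits the right tool name with a probability that
    rises with skill and falls with difficulty (an assumed toy model)."""
    p_correct = min(0.95, max(0.05, 0.5 + 0.2 * (skill - task.difficulty)))
    if rng.random() < p_correct:
        return task.tool
    return next(name for name in sorted(TOOLS) if name != task.tool)


def verify(task, called_tool):
    """Environment feedback: execute the call and compare with the target."""
    return TOOLS[called_tool](*task.args) == TOOLS[task.tool](*task.args)


def self_play(rounds=50, batch=20, seed=0):
    """Run the loop with no external demonstrations, only verified outcomes."""
    rng = random.Random(seed)
    skill, difficulty = 1.0, 1
    history = []
    for _ in range(rounds):
        tasks = [generate_task(rng, difficulty) for _ in range(batch)]
        solved = sum(verify(t, solve(rng, t, skill)) for t in tasks)
        rate = solved / batch
        # Solver update: stand-in for a policy update on verified successes.
        skill += 0.1 * rate
        # Generator update: keep tasks near the edge of the Solver's ability.
        if rate > 0.8:
            difficulty += 1
        elif rate < 0.3 and difficulty > 1:
            difficulty -= 1
        history.append((difficulty, rate, skill))
    return history


if __name__ == "__main__":
    for difficulty, rate, skill in self_play()[-3:]:
        print(f"difficulty={difficulty} solve_rate={rate:.2f} skill={skill:.2f}")
```

The difficulty rule is one simple way to produce a curriculum: tasks get harder only once the Solver clears most of the current batch, which mirrors the co-evolution idea in spirit but makes no claim about how the paper actually schedules or rewards the Generator.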

Relevance to RSI: Eliminates the "human demo bottleneck" for agent capability scaling. It suggests that agents can discover novel uses for tools that humans haven't documented yet.
