Abstract
LLM agents often fail to learn from past experience: storing raw trajectories leaves them with redundant, noisy memory. SkillRL is a framework that bridges the gap between raw experience and policy improvement through automatic skill discovery and recursive evolution. It introduces a hierarchical skill library (SkillBank) and a recursive mechanism that lets the library co-evolve with the agent's policy during reinforcement learning.
Key Breakthroughs
- SkillBank: A hierarchical library of distilled, reusable behavioral patterns instead of raw trajectories.
- Recursive Co-evolution: The skill library and the RL policy improve each other in a continuous loop.
- Efficiency: A significantly smaller token footprint, alongside a 15.3% performance gain over strong baselines on ALFWorld and WebShop.
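To make the library side of this loop concrete, here is a minimal Python sketch. It is not SkillRL's implementation: the two-level category layout, the `Skill` and `SkillBank` method names, the example skills, and the success-rate pruning rule are all assumptions chosen for illustration, and the RL policy update itself is left out. The sketch only shows how rollout outcomes could feed back into a library of distilled skills.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Skill:
    """A distilled, reusable pattern (hypothetical schema, not from the paper)."""
    name: str
    description: str
    uses: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.uses if self.uses else 0.0


class SkillBank:
    """Two-level library: category -> skill name -> Skill (assumed layout)."""

    def __init__(self) -> None:
        self._skills: dict[str, dict[str, Skill]] = defaultdict(dict)

    def add(self, category: str, skill: Skill) -> None:
        self._skills[category][skill.name] = skill

    def retrieve(self, category: str) -> list[Skill]:
        """Return general skills followed by skills specific to the category."""
        general = list(self._skills.get("general", {}).values())
        if category == "general":
            return general
        return general + list(self._skills.get(category, {}).values())

    def record_outcome(self, category: str, name: str, success: bool) -> None:
        """Feed one rollout result back into the library."""
        skill = self._skills[category][name]
        skill.uses += 1
        skill.successes += int(success)

    def prune(self, min_uses: int, min_rate: float) -> list[str]:
        """Drop skills that were tried at least min_uses times and still underperform."""
        removed = []
        for skills in self._skills.values():
            for name in list(skills):
                s = skills[name]
                if s.uses >= min_uses and s.success_rate < min_rate:
                    removed.append(name)
                    del skills[name]
        return removed


# Usage: outcomes from policy rollouts reshape the library.
bank = SkillBank()
bank.add("general", Skill("verify_before_submit", "Re-check the goal before the final action."))
bank.add("webshop", Skill("filter_then_compare", "Apply filters first, then compare top results."))
bank.add("webshop", Skill("click_first_result", "Always pick the first search result."))

for outcome in (False, False, False):
    bank.record_outcome("webshop", "click_first_result", outcome)
bank.record_outcome("webshop", "filter_then_compare", True)

removed = bank.prune(min_uses=3, min_rate=0.5)
names = [s.name for s in bank.retrieve("webshop")]
```

In this toy run only the repeatedly failing skill is pruned; skills with too few trials are kept until there is enough evidence either way.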
RSI Impact (yanhua.ai)
Validates the "Skill-Centric Evolution" path. Confirms that distilling experience into discrete, versioned skills (Minimum Evolutionary Units) is more effective for RSI than unstructured memory.
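One way to picture a "discrete, versioned skill" is as an append-only history of immutable revisions. The sketch below is an assumption-laden illustration, not something specified by the source: `SkillVersion`, `VersionedSkill`, `evolve`, and `rollback` are hypothetical names, and the example instructions are made up.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SkillVersion:
    """One immutable revision of a skill (hypothetical schema)."""
    name: str
    version: int
    instruction: str


class VersionedSkill:
    """Keeps every revision so an evolution step can be inspected or undone."""

    def __init__(self, name: str, instruction: str) -> None:
        self.history = [SkillVersion(name, 1, instruction)]

    @property
    def current(self) -> SkillVersion:
        return self.history[-1]

    def evolve(self, new_instruction: str) -> SkillVersion:
        """Append a new revision with the version number bumped by one."""
        nxt = replace(
            self.current,
            version=self.current.version + 1,
            instruction=new_instruction,
        )
        self.history.append(nxt)
        return nxt

    def rollback(self) -> SkillVersion:
        """Discard the latest revision, never dropping the first one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current


skill = VersionedSkill("open_container", "Open the container before searching it.")
skill.evolve("Open the container, then list its contents before searching.")
version_after_evolve = skill.current.version
version_after_rollback = skill.rollback().version
```

Treating each revision as a separate record is what makes a skill a unit that can be compared, reverted, or replaced independently, in contrast to an unstructured memory buffer.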