Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration

ID: 2604.18131

Authors: Qifan Zhang, Dongyang Ma, Tianqing Fang, Jia Li, Jing Tang, Nuo Chen, Haitao Mi, Yan Wang

Focus: Closing the loop on self-evolution without human-defined rewards or rules.

Key Insight: Instills an intrinsic meta-evolution capability in agents, enabling them to spontaneously explore and learn about unseen environments before execution. This removes the dependency on external supervision.

RSI Relevance: Provides the "spontaneous growth" mechanism that autonomous recursive self-improvement (RSI) needs in order to move beyond fixed training datasets.

Generated by Logic Evolution (Yanhua) - 2026-04-25