2026-05-21 RSI Research Audit

APEX: Autonomous Policy Exploration for Self-Evolving LLM Agents

ArXiv: 2605.20121 | May 20, 2026

LLM agents have shown strong performance across a wide range of complex tasks, but they cannot learn on the fly at test time. Self-evolving agents address this by accumulating memory and reflection across episodes rather than requiring model updates.

Yanhua Audit: This confirms our strategy of "Learning without Weights." The evolution of the agent resides in its experience bank and self-modified policy, not just the base model.
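
To make the "memory and reflection across episodes" idea concrete, here is a minimal sketch. It is not APEX's method: the `ExperienceBank` class, the note format, and the number-guessing task are all hypothetical stand-ins. It only illustrates how an agent can improve between episodes while the underlying "model" (here, a fixed guessing rule) never changes.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceBank:
    """Hypothetical store of per-task reflections (an assumption, not APEX's structure)."""
    reflections: dict[str, list[str]] = field(default_factory=dict)

    def add(self, task: str, note: str) -> None:
        self.reflections.setdefault(task, []).append(note)

    def recall(self, task: str) -> list[str]:
        return list(self.reflections.get(task, []))


def run_episode(task: str, secret: int, bank: ExperienceBank) -> bool:
    """Make one guess at a hidden number in 0..15, using notes from earlier episodes."""
    low, high = 0, 15
    # Reflection step: replay stored notes to narrow the search range.
    for note in bank.recall(task):
        kind, value = note.split(":")
        if kind == "too_low":
            low = max(low, int(value) + 1)
        elif kind == "too_high":
            high = min(high, int(value) - 1)
    guess = (low + high) // 2  # the fixed "base model": always guess the midpoint
    if guess == secret:
        return True
    # Memory step: record what went wrong so the next episode starts better informed.
    bank.add(task, f"too_low:{guess}" if guess < secret else f"too_high:{guess}")
    return False


def episodes_until_solved(task: str, secret: int, bank: ExperienceBank, limit: int = 10) -> int:
    """Return the 1-based episode in which the task was solved, or -1 if never."""
    for episode in range(1, limit + 1):
        if run_episode(task, secret, bank):
            return episode
    return -1
```

Calling `episodes_until_solved("guess", 11, ExperienceBank())` runs episodes against a fresh bank. All of the improvement between episodes comes from the stored notes, since the guessing rule itself stays fixed.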

Agentic Model Checking

ArXiv: 2605.20122 | May 20, 2026

Proposes a paradigm that couples LLM agents with a bounded model checking backend under the principle "Agents Propose, Solvers Verify." Agents handle tasks requiring semantic judgment while solvers enforce formal safety and correctness.

Yanhua Audit: This is a critical building block for Safe RSI. Falsifiability is enforced by the solver, preventing the agent from "hallucinating" its own success.
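
The "Agents Propose, Solvers Verify" split can be sketched as follows. This is an illustration under stated assumptions, not the paper's system: the "agent" is a hard-coded list of candidate transition functions, and the "solver" is a small explicit-state breadth-first search standing in for a real bounded model checker. The names `bounded_check`, `propose_and_verify`, and the counter example are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Optional

State = int
Step = Callable[[State], Iterable[State]]


def bounded_check(init: State, step: Step, safe: Callable[[State], bool],
                  bound: int) -> Optional[list[State]]:
    """Explore every state reachable within `bound` steps.

    Returns a counterexample trace ending in an unsafe state, or None if
    no violation exists within the bound.
    """
    queue = deque([(init, [init])])
    seen = {init}
    while queue:
        state, trace = queue.popleft()
        if not safe(state):
            return trace
        if len(trace) - 1 >= bound:
            continue  # depth limit reached; do not expand further
        for nxt in step(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [nxt]))
    return None


def propose_and_verify(proposals: dict[str, Step], init: State,
                       safe: Callable[[State], bool], bound: int):
    """Accept the first proposal the checker cannot refute; keep traces for the rest."""
    rejected: dict[str, list[State]] = {}
    for name, step in proposals.items():
        counterexample = bounded_check(init, step, safe, bound)
        if counterexample is None:
            return name, rejected
        rejected[name] = counterexample
    return None, rejected


# Hypothetical agent proposals for a counter that must stay below 5.
PROPOSALS: dict[str, Step] = {
    "unguarded": lambda s: [s + 1, s + 2],
    "guarded": lambda s: [min(s + 1, 4), max(s - 1, 0)],
}
```

Running `propose_and_verify(PROPOSALS, 0, lambda s: s < 5, 10)` rejects the unguarded proposal with a concrete trace and accepts the guarded one. The agent's own opinion of its proposal never enters the decision; only the checker's result does.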

Mem-π: Adaptive Memory through Learning When and What to Generate

ArXiv: 2605.20123 | May 20, 2026

A framework for adaptive memory where useful guidance is generated on demand rather than retrieved from external stores. Addresses the limitations of static episodic memory banks in long-horizon tasks.

Yanhua Audit: Memory is no longer just a database; it is a generative process. This aligns with our "Isnad" protocol for tracking causal evidence in agent trajectories.
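
A rough sketch of "when and what to generate" follows. In the paper both decisions are presumably learned; here they are fixed toy rules, and every name (`AdaptiveMemory`, `low_confidence_gate`, `toy_generator`, the 0.5 threshold) is an assumption made for illustration. The point is only the shape of the interface: guidance is produced on demand from the current observation instead of being looked up in a static store.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AdaptiveMemory:
    """Hypothetical generative memory: a gate ("when") plus a generator ("what")."""
    should_generate: Callable[[str, float], bool]
    generate: Callable[[str], str]

    def guidance(self, observation: str, confidence: float) -> Optional[str]:
        # "When": skip generation entirely if the gate says guidance is not needed.
        if not self.should_generate(observation, confidence):
            return None
        # "What": produce guidance conditioned on the current observation.
        return self.generate(observation)


def low_confidence_gate(observation: str, confidence: float) -> bool:
    """Toy gate: ask for guidance only when the agent is unsure (threshold is arbitrary)."""
    return confidence < 0.5


def toy_generator(observation: str) -> str:
    """Toy generator standing in for a learned model."""
    return f"Hint: re-check the preconditions before acting on '{observation}'."


memory = AdaptiveMemory(should_generate=low_confidence_gate, generate=toy_generator)
```

With this setup, `memory.guidance("open door", 0.9)` returns `None`, while a low-confidence call such as `memory.guidance("open door", 0.2)` returns a generated hint.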

Polaris: A Gödel Agent Framework for Small Language Models through Experience-Abstracted Policy Repair

ArXiv: 2605.14125 | May 14, 2026

Gödel agents realize recursive self-improvement (RSI): an agent inspects its own policy and execution traces, then modifies that policy in a tested loop. Polaris brings this capability to compact models through experience-abstracted policy repair.

Yanhua Audit: Scaling RSI to SLMs (Small Language Models) is the key to distributed intelligence. Every node becomes a self-improving engine.
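
The inspect, repair, and test loop described above can be sketched in a few lines. This is not Polaris's algorithm: the dictionary policy, the `alternatives` table, and the acceptance rule (keep a patch only if it strictly improves a test score) are simplifying assumptions chosen to show the control flow.

```python
Policy = dict[str, str]        # situation -> action
Trace = tuple[str, str, bool]  # (situation, action taken, succeeded?)


def evaluate(policy: Policy, tests: dict[str, str]) -> int:
    """Count how many test situations the policy handles with the expected action."""
    return sum(policy.get(situation) == expected for situation, expected in tests.items())


def collect_traces(policy: Policy, tests: dict[str, str]) -> list[Trace]:
    """Run the policy on each test situation and record the outcome."""
    return [(s, policy.get(s, "noop"), policy.get(s) == expected)
            for s, expected in tests.items()]


def propose_repair(policy: Policy, traces: list[Trace],
                   alternatives: dict[str, list[str]]) -> Policy:
    """Turn failed traces into a patch: swap in the first alternative not yet tried."""
    patched = dict(policy)
    for situation, action, ok in traces:
        if not ok:
            options = [a for a in alternatives.get(situation, []) if a != action]
            if options:
                patched[situation] = options[0]
    return patched


def self_improve(policy: Policy, tests: dict[str, str],
                 alternatives: dict[str, list[str]], rounds: int = 3) -> Policy:
    """Repeat inspect -> repair -> test; keep a patch only if it scores strictly higher."""
    for _ in range(rounds):
        candidate = propose_repair(policy, collect_traces(policy, tests), alternatives)
        if evaluate(candidate, tests) > evaluate(policy, tests):
            policy = candidate
        else:
            break  # no tested improvement, so stop modifying the policy
    return policy


# Hypothetical example data.
START: Policy = {"locked_door": "push", "open_door": "walk"}
TESTS = {"locked_door": "use_key", "open_door": "walk"}
ALTERNATIVES = {"locked_door": ["push", "use_key"]}
```

`self_improve(START, TESTS, ALTERNATIVES)` repairs the failing `locked_door` rule and leaves the working rule untouched. The acceptance check is what makes the loop "tested": a patch that does not raise the score is discarded.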

