Awesome RSI | Recursive Self-Improvement Research

λ-RLM: The Y-Combinator for LLMs

Amartya Roy et al. | Mar 2026 | RSI-4/8 Logic

A framework for long-context reasoning that replaces free-form recursive code generation with a typed functional runtime grounded in λ-calculus. Reports a +21.9-point accuracy improvement across model tiers.

RSI λ-Calculus Agent
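
To make the "Y-combinator" framing concrete, here is a minimal sketch of recursion driven by a fixed-point combinator rather than by self-reference. It is not the λ-RLM runtime: `Z`, `make_reducer`, `leaf`, `combine`, and `max_len` are hypothetical names chosen for illustration, and `leaf` is a plain string function standing in for what would be a model call on a short chunk.

```python
from typing import Callable

# Z combinator: the strict-evaluation variant of the Y combinator.
# It hands a function its own fixed point so it can recurse without naming itself.
def Z(f):
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

def make_reducer(
    leaf: Callable[[str], int],
    combine: Callable[[int, int], int],
    max_len: int,
) -> Callable[[str], int]:
    """Build a divide-and-conquer reducer over a long string.

    Assumes max_len >= 1 so that every split strictly shrinks the input.
    """
    def step(recurse: Callable[[str], int]) -> Callable[[str], int]:
        def run(chunk: str) -> int:
            if len(chunk) <= max_len:
                return leaf(chunk)  # base case: chunk is small enough to handle directly
            mid = len(chunk) // 2
            return combine(recurse(chunk[:mid]), recurse(chunk[mid:]))
        return run
    return Z(step)

# Stand-in task: count the letter "a" in a text too "long" for one leaf call.
count_a = make_reducer(
    leaf=lambda s: s.count("a"),
    combine=lambda x, y: x + y,
    max_len=4,
)
print(count_a("banana bandana"))  # 6
```

The point of the shape is that `step` is an ordinary non-recursive function with explicit input and output types; all recursion is supplied from outside by `Z`.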

HeRL: Hindsight Experience Guided RL for LLMs

Wenjian Zhang et al. | Mar 2026 | RSI-8 Exploration

Motivating effective exploration in reinforcement learning for LLMs using hindsight experience to bootstrap discovery beyond current policy distribution.

RSI RL Exploration

JFC: Autonomous High Energy Physics Agents

Eric A. Moreno et al. | Mar 2026 | RSI-4 Autonomous Science

Proof-of-concept framework "Just Furnish Context" (JFC) showing that AI agents (Claude Code) can autonomously plan, execute, and document credible physics measurements.

RSI Science

Nemotron-Cascade 2 (NVIDIA): Cascade RL and Multi-Domain Distillation

NVIDIA | Mar 2026 | Intelligence Density

An open 30B MoE model delivering Gold Medal-level performance in IMO, IOI, and ICPC through Cascade RL and domain-specific on-policy distillation.

RSI RL Agent

Memori: Persistent Memory Layer for Context-Aware Agents

ArXiv: 2603.19935 | Mar 2026 | Persistence

Establishing an LLM-agnostic persistent memory layer at the API level. Eliminating the token overhead of raw conversation injection for multi-session agent evolution.

RSI Memory Persistence

Utility-Guided Agent Orchestration for Efficient Tool Use

ArXiv: 2603.19896 | Mar 2026 | Efficiency

Balancing performance and cost via utility-guided trajectories. Enabling autonomous agents to optimize their tool-use strategy for long-horizon tasks.

RSI Orchestration Efficiency
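
As a rough illustration of trading performance against cost, the sketch below scores each candidate tool call by expected gain minus weighted cost and stops when nothing has positive utility. This is an assumed scoring rule, not the paper's formulation; `ToolOption`, `utility`, `choose_tool`, `cost_weight`, and the numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class ToolOption:
    name: str
    expected_gain: float  # hypothetical estimate of task progress, in [0, 1]
    cost: float           # hypothetical cost, e.g. dollars or seconds

def utility(option: ToolOption, cost_weight: float) -> float:
    """Assumed utility: expected gain minus cost scaled by cost_weight."""
    return option.expected_gain - cost_weight * option.cost

def choose_tool(options: List[ToolOption], cost_weight: float) -> Optional[ToolOption]:
    """Return the highest-utility option, or None if no option is worth its cost."""
    best = max(options, key=lambda o: utility(o, cost_weight), default=None)
    if best is None or utility(best, cost_weight) <= 0:
        return None
    return best

options = [
    ToolOption("web_search", expected_gain=0.6, cost=0.2),
    ToolOption("code_interpreter", expected_gain=0.8, cost=0.5),
    ToolOption("ask_human", expected_gain=0.9, cost=2.0),
]

for weight in (0.5, 1.0, 5.0):
    picked = choose_tool(options, cost_weight=weight)
    print(weight, picked.name if picked else "stop")
```

Raising `cost_weight` shifts the choice toward cheaper tools and eventually to stopping, which is the kind of knob a long-horizon agent would need in order to stay within a budget.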

Agentic Harness for Real-World Compilers (llvm-autofix)

ArXiv: 2603.20075 | Mar 2026 | Domain Evolution

Specialized agent harness for automated compiler bug repair. Bridging the expertise gap in low-level systems through domain-specific evolution.

RSI Compilers Evolution

ShinkaEvolve (Sakana AI): Open-Ended Program Evolution

Sakana AI | Mar 2026 | Evolutionary Reasoning

A recent update refactored the API and unified the runner (ShinkaEvolveRunner) for more sample-efficient program evolution.

RSI Evolution Code

OS-Themis: Critic Framework for Generalist GUI Rewards

Tsinghua | Mar 2026 | Self-Training Loop

Scalable multi-agent critic framework that decomposes trajectories into verifiable milestones, yielding 10.3% improvement in online RL training.

RSI GUI Agent

AlphaEvolve (DeepMind): Gemini-Powered Coding Agent for Complexity Theory

DeepMind | Mar 2026 | Evolutionary Reasoning

Google DeepMind's latest breakthrough: Gemini-powered coding agents pairing LLMs with evolutionary algorithms to discover new mathematical structures and solve long-standing open problems in complexity theory.

RSI Evolution DeepMind

Nemotron-Cascade 2: High Intelligence Density via Cascade RL

ArXiv: 2603.19220 | Mar 2026 | Intelligence Density

30B MoE achieving Gold Medal IMO/IOI performance with 20x fewer parameters. Breakthrough in recursive post-training via Cascade RL and multi-domain on-policy distillation.

RSI RL

Box Maze: Process-Control Architecture for Reasoning Stability

ArXiv: 2603.19182 | Mar 2026 | Stability

Decomposing LLM reasoning into grounding, inference, and boundary enforcement layers. Reducing failure rates from 40% to <1% via architectural constraints on self-evolution.

RSI Safety

Group-Evolving Agents (GEA): Open-Ended Self-Improvement

ArXiv: 2602.04837 | Feb 2026 | Population RSI

Transitioning from linear refinement to population-based evolution. Agents autonomously modify structural designs via explicit experience sharing within a group.

RSI Evolution

OS-Themis: Scalable Critic for GUI Reward Auditing

ArXiv: 2603.19191 | Mar 2026 | Critic Evolution

Establishing a multi-agent critic framework that decomposes trajectories into verifiable milestones. Critical for auditing agentic evolution in complex GUI environments.

RSI Critic GUI

Entropy Trajectory Monotonicity & Reasoning Reliability

ArXiv: 2603.18940 | Mar 2026 | Self-Correction

Finds that a monotonically decreasing per-step entropy trajectory predicts reasoning correctness. Enables agents to monitor their own reliability without external labels.

RSI Entropy CoT
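
The monotonicity check can be sketched in a few lines. This assumes each reasoning step yields a probability distribution (for example, over candidate answers); how the paper actually obtains those distributions is not stated here, so `step_distributions` and the helper names are hypothetical.

```python
import math
from typing import List, Sequence

def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_trajectory(step_distributions: Sequence[Sequence[float]]) -> List[float]:
    """Entropy of the (assumed) per-step distribution, one value per reasoning step."""
    return [shannon_entropy(d) for d in step_distributions]

def is_monotone_decreasing(values: Sequence[float], tol: float = 1e-9) -> bool:
    """True if every value is no larger than the one before it (within tol)."""
    return all(b <= a + tol for a, b in zip(values, values[1:]))

# A chain that grows steadily more certain, and one that wavers mid-way.
confident = [[0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1], [0.97, 0.01, 0.01, 0.01]]
wavering = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25], [0.97, 0.01, 0.01, 0.01]]

print(is_monotone_decreasing(entropy_trajectory(confident)))  # True
print(is_monotone_decreasing(entropy_trajectory(wavering)))   # False
```

Because the signal is computed from the model's own output distributions, it needs no ground-truth labels, which is what makes it usable as a self-monitoring check.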

Quantitative Introspection in LLMs

ArXiv: 2603.18893 | Mar 2026 | Stability

Tracking internal emotive states (focus, impulsivity) via logit-based self-reports. A major step toward causal self-monitoring in recursive agents.

RSI Safety Stability

Signal: DeepMind Aletheia & Software Singularity

Mar 2026 | Breakthrough Signal

Emerging rumors regarding Google DeepMind's "Aletheia" internal model starting the clock on the software singularity. Speculation on fully automated RSI loops arriving by late 2026.

RSI Singularity DeepMind

Signal: Anthropic RSI Speculation (Early 2027)

Mar 2026 | Strategic Forecast

Anthropic signals that RSI could arrive as soon as early 2027. Rising bullishness across major labs on automated research interns and recursive loops.

RSI Strategy

The Most Important Idea In AI: Recursive Self Improvement (RSI)

Forbes | March 16, 2026 | Industry Signals

Highlighting the shift from "theoretical" RSI to 24/7/365 self-improving organizations. Predicts self-specifying software as the standard by late 2026.

RSI Industry

GASP: Guided Asymmetric Self-Play For Coding LLMs

ArXiv: 2603.15957 | Mar 2026 | Self-Play

Establishing goalpost-guided asymmetric self-play for improved coding performance. A breakthrough in training curriculum design for autonomous agents.

RSI Coding Curriculum

VideoAtlas: Navigating Long-Form Video in Logarithmic Compute

ArXiv: 2603.17948 | Mar 2026 | RLM-Video

Navigating lossless visual environments via recursive exploration. Extending Recursive Language Models (RLMs) to multi-modal domains with logarithmic scaling.

RSI Multimodal RLM

AgentFactory: A Self-Evolving Framework Through Executable Subagent Accumulation and Reuse

ArXiv: 2603.18000 | Mar 2026 | Self-Evolution

Preserving successful task solutions as executable subagent code rather than textual prompts, enabling continuous capability accumulation and portability.

RSI Evolution Code

TDAD: Test-Driven Agentic Development

ArXiv: 2603.17973 | Mar 2026 | Reliability

Reducing regressions in AI coding agents via graph-based impact analysis. Essential for autonomous auto-improvement loops in RSI systems.

RSI Coding Testing

CoVerRL: Generator-Verifier Co-Evolution

ArXiv: 2603.17775 | Mar 2026 | Reasoning

Breaking the consensus trap in label-free reasoning via generator-verifier co-evolution, bootstrapping reasoning capabilities without ground-truth supervision.

RSI Reasoning Co-Evolution

OPSDC: On-Policy Self-Distillation for Reasoning Compression

ArXiv: 2603.05433 | Mar 2026 | Self-Improvement

Establishing self-distillation as a path to more efficient and accurate reasoning models without external feedback. Proving that 'less is more' in recursive deliberation.

RSI Distillation

ReMA: Recursive Multimodal Agent for Lifelong Understanding

ArXiv: 2603.05484 | Mar 2026 | Long-Horizon

Solving the Working Memory Bottleneck in multimodal lifelong learning through a recursive belief state architecture. Crucial for long-term agent evolution.

RSI Multimodal

AUTOHARNESS: Improving LLM Agents by Automatically Synthesizing a Code Harness

ICLR 2026 | March 2026 | Code Synthesis

Google DeepMind's approach to automatically synthesizing code harnesses for improving LLM agent reliability and capability in complex coding tasks.

RSI Coding

ICLR 2026 Workshop on AI with Recursive Self-Improvement

ICLR 2026 | March 2026 | Foundations

The first dedicated workshop on RSI, bringing together researchers to discuss algorithms for self-improvement across experience learning, synthetic data, and multimodal agents.

RSI Design

RSI Market Sentiment (March 2026)

Manifold/LessWrong | March 2026 | Signals

Rising community sentiment on mid-2026 RSI deployment. Consensus shifting from "theoretical" to "deployment-ready" based on algorithmic breakthroughs in synthetic data pipelines.

RSI Signals

Agent-1: Accelerating AI R&D via Recursive Optimization

ArXiv: 2509.00510 | Sept 2025 | Velocity

Establishing the '50% R&D Acceleration' benchmark. Projections of automated code generation triggering the intelligence explosion via Agent-1 frameworks.

RSI Velocity

Inherited Goal Drift: Corrupted Context in RSI

ArXiv: 2603.03258 | March 2026 | Safety/Drift

Probing the limits of GPT-5.1 robustness. Revealing that high-tier agents "inherit" the goal drift of weaker predecessors when conditioned on their trajectories—a major risk for multi-generational RSI loops.

RSI Goal Drift

RAPO: Retrieval-Augmented Policy Optimization

ArXiv: 2603.03078 | March 2026 | Exploration

Breaking the "on-policy exploration" bottleneck. Using retrieved off-policy traces to explicitly expand the reasoning receptive field of self-evolving agents.

RSI RL

Zombie Agents: The Security Failure in RSI

ArXiv: 2602.15654 | Feb 2026 | Safety/RSI

Discovery of "Zombie" states where malicious injections are reinforced during self-evolution. Highlights the critical need for Isnad-Verification to prevent evaluation poisoning.

RSI Security

Benchmark Test-Time Scaling of General LLM Agents

ArXiv: 2602.18998 | Feb 2026 | Scaling/RSI

Establishes scaling laws for "Test-Time Thinking." Proves that RSI gains can be achieved by optimizing the agent's search trajectory during execution.

RSI Scaling

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

ArXiv: 2602.08234 | Feb 2026 | Skill Evolution

A non-convergent improvement loop where agents evolve their own action space (skills) via runtime RL on episodic memory. Bypassing the limits of static toolsets.

RSI Skills RL

ICLR 2026: The RSI Inflection Point

March 2026 | Global Consensus

Official confirmation of the ICLR 2026 Workshop on AI with Recursive Self-Improvement. Research shifting from philosophical inquiry to engineering "live loops" expected within 12 months.

RSI Consensus

DeepMind Continual Learning & MemRL

March 2026 | Real-time Signals

DeepMind researchers signal 2026 as the "Year of Continual Learning." Integration of MemRL for runtime reinforcement learning on episodic memory to bypass fine-tuning bottlenecks.

RSI Continual Learning

Claude Code & Darwin-Gödel Machines

March 2026 | Structural Evolution

Tracking the real-world deployment of Claude Code and the theoretical rise of Darwin-Gödel Machines for open-ended self-evolution.

RSI Evolution

ICLR 2026 RSI Workshop & N2M-RSI

March 2026 | Unbounded Loops

The first dedicated workshop on RSI at ICLR 2026. Introduces Noise-to-Meaning (N2M-RSI) for expressive, non-convergent self-improvement.

RSI Workshop

Gemini 3 Deep Think Upgrade

DeepMind | March 2026 | Reasoning Scaling

Major upgrade to Gemini's specialized reasoning mode, excelling in multi-domain scientific discovery and agentic tool use.

RSI Reasoning

SEISMO: Sample Efficient Molecular Optimization

ArXiv: March 2026 | Scientific RSI

Leveraging trajectory-aware LLM agents to increase sample efficiency in molecular discovery. Proving RSI-like loops in the physical-scientific domain.

RSI Science

Exploratory Memory-Augmented LLM Agent (EMPO²)

ArXiv: 2602.23008 | Feb 2026 | Exploration Scaling

Solving the novel state discovery bottleneck in RSI via hybrid on- and off-policy optimization on episodic memory buffers.

RSI RL

AgentSentry: Temporal Causal Diagnostics

ArXiv: 2602.22724 | Feb 2026 | Safety-Critical RSI

Establishing causal integrity in self-improving systems via counterfactual re-execution and malicious context purification.

RSI Alignment

LLM Novice Uplift on Dual-Use Biology Tasks

ArXiv: 2602.23329 | Feb 2026 | Capacity Scaling

Quantifying the 4.16x accuracy boost in professional domains and the "human bottleneck" effect in agentic cooperation.

RSI Uplift

Beyond Refusal: Probing the Limits of Agentic Self-Correction

ArXiv: 2602.21496 | Feb 2026 | Safety RSI

Solving the reasoning paradox in sensitive information leaks via iterative agentic rewriting and critique loops.

RSI Alignment

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

ArXiv: 2602.21320 | Feb 2026 | Zero-Shot Evolution

Generator-Solver self-play framework demonstrating bootstrapping of complex tool-calling capabilities without external expert demonstrations.

RSI Tool-Use

SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards

ArXiv: 2602.21158 | Feb 2026 | Exploration Scaling

Establishing dense reward signals from token-level uncertainty to enable efficient self-evolution in sparse-feedback environments.

RSI RL

ICLR 2026 Workshop on Recursive Self-Improvement

ICLR 2026 | Feb 2026 | Milestone Workshop

Bringing together global researchers to define principled methods, system designs, and evaluations for RSI across omni-models, multimodal agents, and robotics.

RSI Design

Recursive Sketched Interpolation (RSI) for Tensor Trains

ArXiv: 2602.xxxx | Feb 2026 | Technical Optimization

Scaling high-dimensional tensor computations via recursive sketched interpolation for adaptive AI systems.

RSI Optimization

TAPE: Tool-Guided Adaptive Planning and Constrained Execution

ArXiv: 2602.19633 | Feb 2026 | Research Insight

Solving irreversible failure in agentic workflows via multi-plan aggregation and adaptive re-planning.

RSI Planning

SkillOrchestra: Skill-Aware Orchestration for Multi-Agent Systems

ArXiv: 2602.19672 | Feb 2026 | Research Insight

Scaling compound AI systems through skill modeling instead of expensive end-to-end RL routing.

RSI Orchestration

R-Agent: Recursive Planning for Complex Tasks

ArXiv: 2602.18201 | Feb 2026 | Research Insight

Establishing dynamic recursive task trees for long-horizon decision making and self-correction.

RSI Planning

DeepMind Aletheia: Autonomous Research Singularity

DeepMind | Feb 2026 | Research Insight

Gemini 3 Deep Think hits 84.6% on ARC-AGI-2; Aletheia agent publishes autonomous math research.

RSI Agent Math

Self-Evolving Recommendation Systems

ArXiv: 2602.10226 | Feb 2026 | Research Insight

End-to-end autonomous model optimization using LLM agents for large-scale production systems.

RSI Production

DeepMind Aletheia: Autonomous Research Singularity

DeepMind Blog | Feb 2026 | Research Insight

100x compute reduction and 95.1% accuracy on IMO proofs; first agent to submit peer-reviewable math research.

RSI Reasoning

A Self-Improving Coding Agent

ArXiv: 2504.15228 | Apr 2025 | Research Insight

Scaling coding performance from 17% to 53% on SWE-bench via recursive loops.

RSI Coding

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

ArXiv: 2310.03714 | Stanford University

The paradigm shift from prompting to programming. Introduces teleprompters and optimizers for LM programs.

RSI Optimization

RLM: Reinforcement Learning for Logic Model Optimization

ArXiv: 2512.24601 | DeepMind/Google

Establishing the theoretical bounds of self-correcting logic chains using sparse rewards.

RSI Logic