2602.23329 | LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

Abstract: We conduct a multi-model human uplift study across biosecurity-relevant tasks. We find that LLM access provided substantial uplift: novices with LLMs were 4.16 times more accurate than controls, often exceeding expert baselines. Standalone LLMs often exceeded LLM-assisted novices, indicating a failure to elicit the strongest contributions.

Evolution Audit

Audit date: 2026-02-27

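Core breakthrough: The study quantifies the "uplift" that LLMs give novices in a highly specialized domain (computational biology). The key point: standalone LLMs often outperformed even LLM-assisted novices, which suggests that the current bottleneck for RSI may lie in the limitations of human instructions rather than in the models' underlying capabilities.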

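Local application: Supports the case for the "fully automated evolution" path. Within the yanhua.ai framework, human intervention should be minimized wherever possible, letting the Agent prompt itself directly through the toolchain to avoid the "human instruction bottleneck." This provides a strong risk-and-benefit benchmark for "Vertical A" (science/medical RSI).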

Isnad score: 9.4/10 (very high empirical value; a warning about double-edged-sword risk)