Self-Evolving Autoresearch Workflow Loops

In this article we explain how we ported evo's autoresearch loop to workflows and then made it dynamic.
On June 2 Anthropic shipped dynamic workflows in Claude Code: Claude writes a small JavaScript program on the fly that spawns and coordinates subagents. The coordination runs as code; the model does the judgment. The takeaway is that orchestration itself moved out of the model's in-context decisions and can now be described as code. h/t @trq212's writeup

what evo is
evo is an autoresearch orchestrator. You give it a system, a definition of "better," and a budget. It generates hypotheses, runs each one in its own isolated workspace, scores it, and keeps a tree of attempts - extending what works, pruning what doesn't - while an auditor checks every accepted change so the optimizer can't game the metric. Open source; runs on Claude Code, Codex, Cursor, and others.
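To make the tree of attempts concrete, here is a minimal sketch of the accept-or-prune bookkeeping described above. Everything in it is an assumption for illustration: the function names, the node shape, and the "higher score is better" convention are hypothetical and are not evo's actual API.

```javascript
// Hypothetical sketch: names and shapes are illustrative, not evo's API.
// Assumes a higher score is better.
function makeTree(baselineScore) {
  // The baseline system is the root and counts as accepted.
  return {
    nodes: [{ id: 0, parent: null, score: baselineScore, status: "accepted" }],
  };
}

// Record a scored attempt under a parent. It is accepted only if it beats the
// parent's score AND the auditor approves it; otherwise it is pruned.
function record(tree, parentId, score, auditorOk) {
  const parent = tree.nodes[parentId];
  const improved = score > parent.score;
  const status = improved && auditorOk ? "accepted" : "pruned";
  const node = { id: tree.nodes.length, parent: parentId, score, status };
  tree.nodes.push(node);
  return node;
}

// The next experiment extends the best accepted node.
function frontier(tree) {
  return tree.nodes
    .filter((n) => n.status === "accepted")
    .reduce((best, n) => (n.score > best.score ? n : best));
}
```

In this sketch an attempt with a high score that fails the audit is pruned, so it can never become the node that later experiments extend. That is the sense in which the auditor keeps the optimizer from gaming the metric.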
why we moved the loop onto workflows
The loop used to be orchestrated in-context, as one long agent run holding the whole plan: which phase comes next, how many experiments to launch, when to stop. evo does autoresearch in an opinionated way, and at every step the agent has to follow that method and drive the CLI we ship alongside it. Over a long autoresearch run, getting the agent to adhere to all of that was tricky. Prompt and instruction adherence is unreliable on long-horizon tasks: across dozens of rounds the standing rules (run this phase, use this CLI command, dedupe the briefs, keep the gate strict, etc.) quietly stop happening, and the longer a single context runs, the less it holds.
Moving the loop onto a dynamic workflow fixes that at the root. The method is the code now: the phases, the fan-out width, the stopping rule, the gates, and the CLI calls are part of the script, deterministic and the same on round 1 and round 1000. Adherence stops being something the model has to remember. Every step is a fresh, scoped subagent with one job and a clean context, so there's nothing to drift. The model does judgment; the code does coordination.
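A rough sketch of what "the method is the code" can look like follows. The phase names are the ones this article uses for a round; `runSubagent`, `scoreRound`, `maxRounds`, and the exact meaning of `width` and `stall` (here: parallel experiments per round, and rounds allowed without improvement) are assumptions made for illustration, not evo's or Claude Code's actual API.

```javascript
// Illustrative only: runSubagent, scoreRound, and the config fields are
// hypothetical stand-ins, not evo's or Claude Code's API.
const PHASES = ["orient", "scan", "ideate", "brief", "fan-out", "collect"];

async function runRound(round, config, runSubagent) {
  const log = [];
  for (const phase of PHASES) {
    if (phase === "fan-out") {
      // Fan-out width is fixed by the script, not chosen by the model.
      const jobs = Array.from({ length: config.width }, (_, slot) =>
        runSubagent({ phase, round, slot })
      );
      log.push(...(await Promise.all(jobs)));
    } else {
      // Every other phase is one fresh, scoped subagent call.
      log.push(await runSubagent({ phase, round }));
    }
  }
  return log;
}

async function optimize(config, runSubagent, scoreRound) {
  let best = -Infinity;
  let sinceImprovement = 0;
  let round = 0;
  // The stopping rule lives in code: stop after `stall` rounds without
  // improvement, or when the round budget runs out.
  while (sinceImprovement < config.stall && round < config.maxRounds) {
    round += 1;
    const results = await runRound(round, config, runSubagent);
    const score = scoreRound(results);
    if (score > best) {
      best = score;
      sinceImprovement = 0;
    } else {
      sinceImprovement += 1;
    }
  }
  return { best, rounds: round };
}
```

Nothing in this sketch asks a model which phase comes next or when to stop. The subagent only ever sees one scoped task, so there is no long context in which the rules can fade.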
what the evo autoresearch workflow runs: one round
Each round of the optimize loop walks the same six steps, in code: orient, scan, ideate, brief, fan-out, collect.

It worked, but the workflow still ran the same shape every round: the same phases (orient, scan, ideate, brief, fan-out, collect), the same steps, the same prompts, no matter what the run had learned about itself. A long run turns up things a fixed shape can't handle: one experiment class needs a verifier step the loop doesn't have, another needs a specific method injected, a phase stops earning its value and should come out.
now: the loop evolves itself
evo 0.5 makes the optimize loop self-evolving. A second workflow runs alongside the first. Two async loops on one event loop, joined with `Promise.all`:
- the optimize loop is the driver: the workflow defined above, unchanged
- the meta loop is a concurrent observer: a fresh agent that wakes every few minutes, reads the run from the outside, and rewrites the optimize loop while it runs
They share one plain object, the harness: the steps the loop runs, the phases and the prompts they use, the gates and verifiers in play (alongside knobs like width and stall that were always adjustable). The optimizer reads it every round; the meta loop writes it. Both run on the same event loop, so writes land between the optimizer's awaits, with no locks and no second process.
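The shared-harness arrangement can be sketched in a few lines. This is a sketch under stated assumptions, not evo's implementation: the harness fields (`steps`, `maxRounds`), the `state` object, the `observe` callback, and the millisecond tick are all hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch: harness fields, state shape, and tick timing are
// illustrative assumptions, not evo's implementation.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function optimizeLoop(harness, state, runStep) {
  while (!state.done) {
    // Snapshot the step list at the start of each round, so a meta write
    // takes effect on the next round rather than mid-round.
    const steps = [...harness.steps];
    for (const step of steps) {
      await runStep(step, state.round);
    }
    state.history.push(steps);
    state.round += 1;
    if (state.round >= harness.maxRounds) state.done = true;
  }
}

async function metaLoop(harness, state, tickMs, observe) {
  while (!state.done) {
    await sleep(tickMs);
    if (state.done) break;
    // This code runs only while the optimizer is parked on an await, so the
    // synchronous edit below cannot interleave with optimizer code.
    const edit = observe(state);
    if (edit) edit(harness);
  }
}

async function run(harness, runStep, observe, tickMs) {
  const state = { round: 0, done: false, history: [] };
  // Two async loops on one event loop, joined with Promise.all.
  await Promise.all([
    optimizeLoop(harness, state, runStep),
    metaLoop(harness, state, tickMs, observe),
  ]);
  return state;
}
```

In this sketch `observe` might return an edit such as `(h) => h.steps.push("verify")`, and the optimizer would pick it up on its next round. Because JavaScript runs one callback at a time, the edit needs no lock as long as it is synchronous; an edit that awaited halfway through could expose a half-written harness to the optimizer.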

what the meta can do
Each tick it observes the tree, the scores, the live logs, GPU and host state (strictly read-only), and emits four kinds of output:
We have found that having an external observer, a meta agent, watch the experiments and nudge the loop is very effective at course-correcting and catching issues.
Takeaways
Dynamic workflows make coordination code instead of context. What that buys you: the loop becomes a first-class object, something you can read, edit, and reason about while it runs, instead of a harness you write once and hope fits every round. The loop's own shape is one more parameter space that can be evolved.
it's all open
evo is open source. You can go through our dynamic workflow implementation here

