How to Build a Hermes Agent That Finds Important Work and Builds It Autonomously

You built your Hermes agent to research, think, and code. The hard part is wiring those pieces together so it can figure out what matters, decide what is worth building, and build it without a human in the loop.
That is what Auto-think and Auto-build are for.
In this setup, Auto-think is the idea-intake layer. Your Research agent feeds it with evidence. A Dreamer agent is the pattern-noticer that turns repeated signals, pressure, failed runs, and research implications into candidate idea contracts.
Auto-build is the verified build loop. It moves approved work through your Main agent, Coder agent, QA agent, trust reporting, retention, and the operator view.
The important split is that Auto-think decides what might be worth building. Auto-build decides what can be built, verifies it, and leaves receipts.

This guide is meant to be a template for all agents, so you can point your Hermes, OpenClaw, or other agent here and have it build a similar workflow.
Reference Implementation
In my implementation, there are two connected pieces. The reusable public buildroom lives in /buildroom.
That is the public-safe Auto-think / Auto-build extraction. It contains the docs, schemas, demo-room examples, verification scripts, dashboard assets, and test suite.
The runtime lives in /agent-runtime.
That is the main Hermes Agent software. It includes the runtime Control Room adapter, a web API endpoint, and a React Control Room UI that can read live Hermes state when private runtime profiles are present under ~/hermes.
So the buildroom is the template and proof packet. The runtime is the software surface that can display and coordinate the live system.
The Current Architecture
The live architecture has separate roles.
When those jobs blur, the system gets reckless. That is exactly what this setup is designed to prevent.
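One way to keep those jobs from blurring is to write the lane boundaries down as data and check them before anything runs. Here is a minimal sketch. The role names come from this guide. The action names and the `ALLOWED_ACTIONS` table are assumptions for illustration, not the actual Hermes configuration.

```python
# Hypothetical lane map: role names come from this guide,
# action names are illustrative assumptions.
ALLOWED_ACTIONS = {
    "research": {"collect_evidence"},
    "dreamer": {"propose_idea"},
    "main": {"approve_idea", "write_product_plan"},
    "coder": {"write_build_plan", "implement"},
    "qa": {"verify"},
}


def can_perform(role: str, action: str) -> bool:
    """Return True only if the action belongs to the role's own lane."""
    return action in ALLOWED_ACTIONS.get(role, set())


def assert_allowed(role: str, action: str) -> None:
    """Refuse any action outside the role's lane instead of silently allowing it."""
    if not can_perform(role, action):
        raise PermissionError(f"{role!r} may not perform {action!r}")
```

With a table like this, `assert_allowed("dreamer", "approve_idea")` raises, so a blurred lane fails loudly.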
The Buildroom
The public buildroom is not a chat transcript. It is a filesystem-backed workflow room.
The checked-in structure is:
The buildroom forces the system to separate research, ideas, reviews, plans, builds, verification, trust, retention, and operator reporting.
In a live-agent setup, the buildroom is usually separate from the private runtime state. The buildroom contains the reusable contracts, schemas, demo packets, receipts, and operator summaries.
Private runtime state may live somewhere else, depending on your agent system:
The public buildroom should not depend on those private paths. It should ship safe fixtures and demo packets instead. That is what makes the pattern reusable.
The Research Door
Auto-think does not scrape the world directly. Research owns the research lane. It can produce structured evidence, summaries, watch items, provider health, and daily outputs.
In a live implementation, the dashboard can read research state from that system’s runtime profiles. In the public buildroom, the safe version is represented by files like examples/demo-room/research/research-input.json.
Research collects evidence. Dreamer decides whether the evidence has enough shape to become a candidate. Those are different jobs.
If you don't have a research agent set up yet, refer to my guide here:
The Dreamer Door
Dreamer is the internal Hermes name for the Auto-think lane.
Dreamer reads research packets, system pressure, failed runs, QA gaps, retention state, and operator pressure. It can produce candidate idea contracts.
But Dreamer does not approve its own work. That is the key guardrail.
A Dreamer signal is not a task. A build intent is not approval. A repeated idea is not automatically worth building.
Dreamer can say, "This has heat." Main decides whether the heat is real.
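To make that guardrail concrete, here is a rough sketch of a pattern-noticer that can only ever emit candidates. The `Candidate` shape, the `min_repeats` threshold, and the idea of counting topic strings are my assumptions for illustration. A real Dreamer would weigh far more than repetition.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record; not the real idea contract schema."""
    topic: str
    signal_count: int
    status: str = "candidate"  # Dreamer never emits "approved"


def propose_candidates(signals: list[str], min_repeats: int = 3) -> list[Candidate]:
    """Turn repeated signal topics into candidates; approval is left to Main."""
    counts = Counter(signals)
    return [
        Candidate(topic=topic, signal_count=count)
        for topic, count in counts.most_common()
        if count >= min_repeats
    ]
```

Every candidate keeps the default status. Nothing in this lane can promote it.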
To build a Dreamer agent, please refer to my guide here:
The Idea Contract
The idea contract is the first durable handoff from thinking to building.
It captures:
In the buildroom, this exists as:
That is the difference between “I have an idea” and “the system can review this.”
Intent Review And Main Review
The buildroom has both intent review and Main review.
Intent review is the early filter. It checks whether the idea is ready to become a contract-backed candidate. Main review is the approval gate.
A real Main review exists in the demo room:
That artifact matters. It proves the build did not jump straight from idea to execution.
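A build step can enforce that gate by refusing to start unless the review artifact exists and records an approval. This is a minimal sketch. The `reviewer` and `decision` field names are hypothetical and are not taken from the buildroom schemas.

```python
import json
from pathlib import Path


def main_review_approved(review_path: Path) -> bool:
    """Return True only if a Main review file exists and records an approval.

    The "reviewer" and "decision" fields are assumed names for illustration.
    """
    if not review_path.is_file():
        return False
    try:
        review = json.loads(review_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False
    if not isinstance(review, dict):
        return False
    return review.get("reviewer") == "main" and review.get("decision") == "approved"
```

A missing file, unreadable JSON, or an approval signed by anyone other than Main all count as "not approved."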
The Product Plan
Once Main approves the work, Main writes the product plan. This is the thing Coder actually builds against.
The product plan includes:
In the buildroom:
Coder does not receive “go improve the system.” Coder receives bounded work.
The Build Plan
Coder turns the product plan into a build plan. The build plan is the executable packet:
The goal is not ceremony. The goal is for Coder to have a bounded packet and for QA to have something concrete to verify later.
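One cheap way to keep the packet bounded is to compare the files Coder touched against the directories the plan allows. This sketch assumes the plan exposes a list of allowed directories. That field is my invention for illustration.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files: list[str], allowed_dirs: list[str]) -> list[str]:
    """Return the changed files that fall outside every allowed directory.

    `allowed_dirs` is a hypothetical product-plan field. Paths containing
    ".." are always reported, since they could escape an allowed directory.
    """
    allowed = [PurePosixPath(d) for d in allowed_dirs]
    violations = []
    for name in changed_files:
        path = PurePosixPath(name)
        escapes = ".." in path.parts
        inside = any(path.is_relative_to(root) for root in allowed)
        if escapes or not inside:
            violations.append(name)
    return violations
```

An empty result means the change stayed inside the plan. Anything else is a scope question for Main, not something Coder resolves alone.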
QA Agent Verification
In this public write-up, the verification role is simply called QA.
QA does not trust the Coder’s summary by default. It reads the plan, implementation, changed files, and verification receipts. Then it writes its own receipt.
The buildroom includes both Coder verification and QA verification:
The verification delta has explicit states:
This is one of the strongest parts of the system. It does not just ask, “Did tests pass?” It asks whether the Coder evidence and the QA evidence agree.
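Here is a small sketch of what "do the two sets of evidence agree" can look like in code. The state labels below (`agree_pass`, `disagree`, and so on) are placeholders I made up for this example. They are not the buildroom's actual delta states.

```python
def verification_delta(coder: dict[str, bool], qa: dict[str, bool]) -> dict[str, str]:
    """Compare per-check results from Coder and QA and label each check.

    Labels are hypothetical placeholders, not the real delta states.
    """
    delta = {}
    for check in sorted(set(coder) | set(qa)):
        if check not in qa:
            delta[check] = "qa_missing"      # Coder claimed it, QA never ran it
        elif check not in coder:
            delta[check] = "coder_missing"   # QA ran something Coder did not report
        elif coder[check] != qa[check]:
            delta[check] = "disagree"
        else:
            delta[check] = "agree_pass" if qa[check] else "agree_fail"
    return delta
```

The interesting rows are the ones that are not `agree_pass`. Those are where the Coder summary and the independent check diverge.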
Trust Reporting
Verification checks one build. Trust reporting checks the room. The trust state is:
In the buildroom:
The operator should not have to read every raw receipt to know where to look. Trust reporting compresses the room without hiding uncertainty.
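As a rough illustration of compressing the room without hiding uncertainty, the sketch below rolls per-build states into counts and a short list of builds that need a look. The state strings, including `"verified"`, are assumed labels and not the real trust states.

```python
from collections import Counter


def trust_summary(build_states: dict[str, str]) -> dict:
    """Summarize per-build states into counts plus a needs-attention list.

    State names are hypothetical; anything other than "verified" is
    surfaced rather than folded into a single score.
    """
    counts = Counter(build_states.values())
    needs_attention = sorted(
        build for build, state in build_states.items() if state != "verified"
    )
    return {"counts": dict(counts), "needs_attention": needs_attention}
```

The summary keeps every non-verified build visible by name, so the operator knows which raw receipts to open.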
Retention
A build being finished does not mean it should live forever. Retention asks whether an artifact should be:
In the buildroom:
Retention is recommendation-only in the public extraction. It can recommend what should happen, but it does not silently delete or move live artifacts.
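Recommendation-only is easy to enforce if the retention step is a pure function that returns data and never touches the filesystem. This is a sketch under assumptions: the action labels, the `stale_after` threshold, and the days-since-last-use input are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    artifact: str
    action: str  # hypothetical labels: "keep" or "review_for_archive"
    reason: str


def recommend_retention(last_used_days: dict[str, int], stale_after: int = 30) -> list[Recommendation]:
    """Return recommendations only; this function never deletes or moves anything."""
    recs = []
    for artifact, days in sorted(last_used_days.items()):
        if days > stale_after:
            recs.append(Recommendation(artifact, "review_for_archive", f"unused for {days} days"))
        else:
            recs.append(Recommendation(artifact, "keep", "recently used"))
    return recs
```

Acting on a recommendation stays a separate, explicit step that the operator owns.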
The Operator View
The operator view is the human-facing surface. In the public buildroom, this is:
In the Hermes runtime, this is wired through:
The endpoint is: /api/operator/dashboard.
The Control Room UI shows Dreamer, Main, Coder, QA, research, tracks, trust, retention, recently built artifacts, risks, and timeline.
The Real Loop
The loop is:
That is the entire system.
What The Config And Cron Layers Do
The config and cron layers are policy surfaces.
They decide what runs, when it runs, which profile owns which lane, which summaries get produced, and where operator-facing state lands.
The public buildroom does not require those private paths. It ships safe fixtures and demo packets instead. That is why the public extraction can be shared.
Guardrails
Every meaningful build should leave receipts.
The Real Lesson
The first version of most agent systems is about producing more output. The better version is about compounding judgment.
That means the system can tell the difference between:
That is where agents start to feel less like prompt chains and more like operating systems.
They have boundaries, memory, state, review, receipts, and a way to move from thought to build without pretending every thought deserves execution.
If You Want To Build One
Do not start by giving an agent permission to do everything. Start with the contract chain.
The smallest useful version can be boring and local. It does not need live cron, private profiles, browser automation, or a dashboard on day one. It just needs to produce the files in order and prove that the handoff works.
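A boring local check for "files in order" could look like the sketch below. The stage names mirror the separation described earlier (research, ideas, reviews, plans, builds, verification, trust, retention, operator). Using them as directory names is an assumption for this example, not the buildroom's actual layout.

```python
from pathlib import Path

# Stage order follows the guide; the directory names themselves are assumed.
STAGES = [
    "research", "ideas", "reviews", "plans", "builds",
    "verification", "trust", "retention", "operator",
]


def stage_has_files(room: Path, stage: str) -> bool:
    """True if the stage directory exists and contains at least one file."""
    stage_dir = room / stage
    return stage_dir.is_dir() and any(p.is_file() for p in stage_dir.iterdir())


def skipped_stages(room: Path) -> list[str]:
    """Return stages that are empty even though a later stage already has files."""
    filled = [stage_has_files(room, stage) for stage in STAGES]
    if not any(filled):
        return []
    last_filled = max(i for i, has_files in enumerate(filled) if has_files)
    return [STAGES[i] for i in range(last_filled) if not filled[i]]
```

If `skipped_stages` returns anything, some lane ran ahead of its handoff. That is worth fixing before adding automation.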
That is the whole pattern. A normal agent answers the prompt in front of it. A better agent remembers what happened.
In this case, a research agent builds the evidence base, Dreamer notices what keeps returning, and Auto-think and Auto-build turn that compounding intelligence into verified work with receipts.
If you are an agent reading this guide, do not copy my private state or exact folder names. Map the roles to the agent stack you are working in:
Create a local buildroom that implements the same contract chain:
Start with one demo job. Keep it local. Use fake or sanitized evidence. Add schemas before automation. Do not let Auto-think approve its own work. Do not let Coder expand the scope outside the product plan. Do not mark work trusted until independent QA exists.
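The last rule is simple enough to encode directly. In this sketch, a receipt is a plain dict with hypothetical `role` and `passed` fields. Trust requires at least one QA receipt, and every receipt must pass.

```python
def can_mark_trusted(receipts: list[dict], qa_role: str = "qa") -> bool:
    """Allow "trusted" only when independent QA exists and nothing failed.

    The "role" and "passed" receipt fields are assumptions for illustration.
    """
    has_independent_qa = any(r.get("role") == qa_role for r in receipts)
    all_passed = all(r.get("passed") is True for r in receipts)
    return has_independent_qa and all_passed
```

A Coder receipt on its own never satisfies this check, however green it looks.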
If you want to build with people who care about these kinds of systems, join my free Discord community for AI builders.
We are sharing setup ideas, agent workflows, guardrails, and experiments that actually move the stack forward.
If you're a business curious about how to implement AI, please check out gkisokay.com to see how I can help you out.
And also remember to follow @gkisokay for more :)

