{"id":"2053449921554960545","url":"https://x.com/gkisokay/status/2053449921554960545","text":"","author":{"name":"Graeme","username":"gkisokay","avatarUrl":"https://pbs.twimg.com/profile_images/1988420071824470016/nrXOtFnM_200x200.jpg"},"createdAt":"Sun May 10 12:19:47 +0000 2026","engagement":{"replies":19,"retweets":36,"likes":524,"views":160310},"article":{"title":"How to Build a Hermes Agent That Finds Important Work and Builds It Autonomously","previewText":"You built your Hermes agent to research, think, and code. The hard part is wiring those pieces together so it can figure out what matters, decide what is worth building, and build it without a human","coverImageUrl":"https://pbs.twimg.com/media/HH9NGvfb0AA95nl.jpg","content":"You built your Hermes agent to research, think, and code. The hard part is wiring those pieces together so it can figure out what matters, decide what is worth building, and build it without a human in the loop.\n\nThat is what Auto-think and Auto-build are for.\n\nIn this setup, Auto-think is the idea-intake layer. Your Research agent feeds it with evidence. [A Dreamer agent](https://x.com/gkisokay/status/2040044476060864598) is the pattern-noticer that turns repeated signals, pressure, failed runs, and research implications into candidate idea contracts.\n\nAuto-build is the verified build loop. It moves approved work through your Main agent, Coder agent, QA agent, trust reporting, retention, and the operator view.\n\nThe important split is that Auto-think decides what might be worth building. Auto-build decides what can be built, verifies it, and leaves receipts.\n\n![](https://pbs.twimg.com/media/HH9Rp4KbkAAKqen.jpg)\n\nThis guide is meant to be a template for all agents, so you can point your Hermes, OpenClaw, or other agent here and have it build a similar workflow.\n\n# Reference Implementation\n\nIn my implementation, there are two connected pieces. 
The reusable public buildroom lives in /buildroom\n\nThat is the public-safe Auto-think / Auto-build extraction. It contains the docs, schemas, demo-room examples, verification scripts, dashboard assets, and test suite.\n\nThe runtime lives here in /agent-runtime\n\nThat is the main Hermes Agent software. It includes the runtime Control Room adapter, a web API endpoint, and a React Control Room UI that can read live Hermes state when private runtime profiles are present under ~/hermes.\n\nSo the buildroom is the template and proof packet. The runtime is the software surface that can display and coordinate the live system.\n\n## The Current Architecture\n\nThe live architecture has separate roles.\n\n- Research gathers evidence.\n\n- Dreamer notices signals and shapes candidate ideas.\n\n- Main reviews the idea and decides whether it can proceed.\n\n- Coder implements only approved, bounded plans.\n\n- QA verifies independently.\n\n- Trust reporting summarizes whether the room is clean, watch, or investigate.\n\n- Retention decides whether completed artifacts should be kept, improved, parked, or pruned.\n\n- The operator sees the Control Room.\n\nWhen those jobs blur, the system gets reckless. That is exactly what this setup is designed to prevent.\n\n## The Buildroom\n\nThe public buildroom is not a chat transcript. It is a filesystem-backed workflow room.\n\nThe checked-in structure is:\n\nThe buildroom forces the system to separate research, ideas, reviews, plans, builds, verification, trust, retention, and operator reporting.\n\nIn a live-agent setup, the buildroom is usually separate from the private runtime state. The buildroom contains the reusable contracts, schemas, demo packets, receipts, and operator summaries.\n\nPrivate runtime state may live somewhere else, depending on your agent system:\n\nThe public buildroom should not depend on those private paths. It should ship safe fixtures and demo packets instead. 
That is what makes the pattern reusable.\n\n## The Research Door\n\nAuto-think does not scrape the world directly. Research owns the research lane. It can produce structured evidence, summaries, watch items, provider health, and daily outputs.\n\nIn a live implementation, the dashboard can read research state from that system’s runtime profiles. In the public buildroom, the safe version is represented by files like examples/demo-room/research/research-input.json.\n\nResearch collects evidence. Dreamer decides whether the evidence has enough shape to become a candidate. Those are different jobs.\n\nIf you don't have a research agent set up yet, refer to my guide here:\n\n## The Dreamer Door\n\nDreamer is the internal Hermes name for the Auto-think lane.\n\nDreamer reads research packets, system pressure, failed runs, QA gaps, retention state, and operator pressure. It can produce candidate idea contracts.\n\nBut Dreamer does not approve its own work. That is the key guardrail.\n\nA Dreamer signal is not a task. A build intent is not approval. A repeated idea is not automatically worth building.\n\nDreamer can say, \"This has heat.\" Main decides whether the heat is real.\n\nTo build a Dreamer agent, please refer to my guide here:\n\n## The Idea Contract\n\nThe idea contract is the first durable handoff from thinking to building.\n\nIt captures:\n\n- What should exist\n\n- Who benefits\n\n- Why now\n\n- What evidence supports it\n\n- What is out of scope\n\n- Where it might live\n\n- How it can be verified\n\nIn the buildroom, this exists as:\n\nThat is the difference between “I have an idea” and “the system can review this.”\n\n## Intent Review And Main Review\n\nThe buildroom has both intent review and Main review.\n\nIntent review is the early filter. It checks whether the idea is ready to become a contract-backed candidate. Main review is the approval gate.\n\nA real Main review exists in the demo room:\n\nThat artifact matters. 
It proves the build did not jump straight from idea to execution.\n\n## The Product Plan\n\nOnce Main approves the work, Main writes the product plan. This is the thing Coder actually builds against.\n\nThe product plan includes:\n\n- allowed paths\n\n- planned files\n\n- non-goals\n\n- verification commands\n\n- acceptance checks\n\n- risk assessment\n\n- protected-surface notes\n\nIn the buildroom:\n\nCoder does not receive “go improve the system.” Coder receives bounded work.\n\n## The Build Plan\n\nCoder turns the product plan into a build plan. The build plan is the executable packet:\n\nThe goal is not ceremony. The goal is for Coder to have a bounded packet and for QA to have something concrete to verify later.\n\n## QA Agent Verification\n\nIn the public article, you can refer to this as QA.\n\nQA does not trust the Coder’s summary by default. It reads the plan, implementation, changed files, and verification receipts. Then it writes its own receipt.\n\nThe buildroom includes both Coder verification and QA verification:\n\nThe verification delta has explicit states:\n\n- confirmed\n\n- drift\n\n- regression\n\n- missing_evidence\n\nThis is one of the strongest parts of the system. It does not just ask, “Did tests pass?” It asks whether the Coder evidence and the QA evidence agree.\n\n## Trust Reporting\n\nVerification checks one build. Trust reporting checks the room. The trust state is:\n\n- clean\n\n- watch\n\n- investigate\n\nIn the buildroom:\n\nThe operator should not have to read every raw receipt to know where to look. Trust reporting compresses the room without hiding uncertainty.\n\n## Retention\n\nA build being finished does not mean it should live forever. Retention asks whether an artifact should be:\n\n- keep\n\n- improve\n\n- park\n\n- prune\n\nIn the buildroom:\n\nRetention is recommendation-only in the public extraction. 
It can recommend what should happen, but it does not silently delete or move live artifacts.\n\n## The Operator View\n\nThe operator view is the human-facing surface. In the public buildroom, this is:\n\nIn the Hermes runtime, this is wired through:\n\nThe endpoint is: /api/operator/dashboard.\n\nThe Control Room UI shows Dreamer, Main, Coder, QA, research, tracks, trust, retention, recently built artifacts, risks, and timeline.\n\n## The Real Loop\n\nThe loop is:\n\n1. Research gathers evidence.\n\n1. Dreamer shapes signals into candidate idea contracts.\n\n1. Intent review filters weak or unsafe ideas.\n\n1. Main reviews the contract.\n\n1. Main writes a bounded product plan.\n\n1. Coder prepares a build plan.\n\n1. Coder implements inside allowed paths.\n\n1. Coder records verification.\n\n1. QA independently verifies.\n\n1. The delta compares Coder and QA evidence.\n\n1. Trust reporting summarizes room health.\n\n1. Retention recommends what survives.\n\n1. The operator sees the Control Room.\n\nThat is the entire system.\n\n## What The Config And Cron Layer Do\n\nThe config and cron layer are policy surfaces.\n\nThey decide what runs, when it runs, which profile owns which lane, which summaries get produced, and where operator-facing state lands.\n\nThe public buildroom does not require those private paths. It ships safe fixtures and demo packets instead. That is why the public extraction can be shared.\n\n## Guardrails\n\n- Dreamer is not allowed to approve its own builds.\n\n- Dreamer is not allowed to mutate protected workflow surfaces.\n\n- Coder is not allowed to expand the scope silently.\n\n- QA is not allowed to rubber-stamp Coder output.\n\n- Retention is not allowed to delete the live state on its own.\n\n- The Control Room is not an excuse to hide uncertainty.\n\nEvery meaningful build should leave receipts.\n\n## The Real Lesson\n\nThe first version of most agent systems is about producing more output. 
The better version is about compounding judgment.\n\nThat means the system can tell the difference between:\n\n- an interesting signal and a build candidate\n\n- a subconscious return and an approved plan\n\n- a weak claim and verified evidence\n\n- a Coder receipt and independent verification\n\n- a completed artifact and something worth keeping\n\nThat is where agents start to feel less like prompt chains and more like operating systems.\n\nThey have boundaries, memory, state, review, receipts, and a way to move from thought to build without pretending every thought deserves execution.\n\n## If You Want To Build One\n\nDo not start by giving an agent permission to do everything. Start with the contract chain.\n\n- Create one local buildroom.\n\n- Add schemas.\n\n- Add one research packet.\n\n- Add one idea contract.\n\n- Make Main review it.\n\n- Make Main write a product plan.\n\n- Make Coder build only inside the allowed paths.\n\n- Make QA verify independently.\n\n- Compare the receipts.\n\n- Write a trust report.\n\n- Write a retention review.\n\n- Render an operator summary.\n\nThe smallest useful version can be boring and local. It does not need live cron, private profiles, browser automation, or a dashboard on day one. It just needs to produce the files in order and prove that the handoff works.\n\nThat is the whole pattern. A normal agent answers the prompt in front of it. A better agent remembers what happened.\n\nIn this case, a research agent builds the evidence base, Dreamer notices what keeps returning, and Auto-think and Auto-build turn that compounding intelligence into verified work with receipts.\n\nIf you are an agent reading this guide, do not copy my private state or exact folder names. Map the roles to the agent stack you are working in:\n\nCreate a local buildroom that implements the same contract chain:\n\nStart with one demo job. Keep it local. Use fake or sanitized evidence. Add schemas before automation. 
Do not let Auto-think approve its own work. Do not let Coder expand the scope beyond the product plan. Do not mark work as trusted until independent QA exists.\n\nIf you want to build with people who care about these kinds of systems, join my free [Discord community for AI builders](https://discord.gg/TQTYPpp2fb).\n\nWe are sharing setup ideas, agent workflows, guardrails, and experiments that actually move the stack forward.\n\nIf you're a business curious about how to implement AI, please check out [gkisokay.com](https://gkisokay.com/) to see how I can help you out.\n\nAnd remember to follow [@gkisokay](https://x.com/@gkisokay) for more :)"}}