{"id":"2062203743387459836","url":"https://x.com/delba_oliveira/status/2062203743387459836","text":"","author":{"name":"Delba","username":"delba_oliveira","avatarUrl":"https://pbs.twimg.com/profile_images/2045922548387651584/fzDD934a_200x200.jpg"},"createdAt":"Wed Jun 03 16:04:20 +0000 2026","engagement":{"replies":20,"retweets":87,"likes":1181,"views":211054},"article":{"title":"Feedback loops: Help Claude Code complete ambitious tasks with less babysitting ","previewText":"As we delegate more ambitious tasks to Claude, it becomes increasingly important that it can verify its own work.\n\nThe more Claude can self-verify:\nthe more independently it can work on long-running","coverImageUrl":"https://pbs.twimg.com/media/HJ5lkPEWsAAKBXX.jpg","content":"As we delegate more ambitious tasks to Claude, it becomes increasingly important that it can verify its own work.\n\nThe more Claude can self-verify:\n\n- the more independently it can work on long-running tasks\n\n- the better the quality of the final result\n\n- the fewer back-and-forths it takes to get there\n\nThe good news is that Claude already self-verifies against deterministic signals like type errors, lint errors, failing tests, and runtime errors. And as models improve, this will only get better.\n\nWhat Claude can’t always infer are the manual checks you run after it responds and, later on, before you merge code into production.\n\nThe more of those checks you can encode, the closer Claude’s first response gets to the final result you had in mind.\n\nYou spend less time babysitting, and Claude can keep going while you work on something else.\n\n## Write down your processes\n\nA good place to start is to write down the best-practices version of what you or your team already do.\n\nFor frontend, that's usually: run the dev server, open the browser, check the console for errors, click around as a user would, and look out for things like layout shift or slow navigations. 
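As a rough illustration, the frontend process above could be written down as an ordered checklist before it is turned into a skill. This is only a sketch under stated assumptions: the `Check` class, the `run_checks` helper, and the two stubbed check functions are hypothetical names, not part of Claude Code or any tool mentioned here, and the stubs return fixed values where a real check would inspect the running app.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    # One manual step from the written-down process (hypothetical structure).
    name: str
    run: Callable[[], Tuple[bool, str]]  # returns (passed, detail)


def run_checks(checks: List[Check]) -> List[Tuple[str, bool, str]]:
    # Run every check in order and collect all results, so every
    # failure is reported at once instead of stopping at the first one.
    results = []
    for check in checks:
        passed, detail = check.run()
        results.append((check.name, passed, detail))
    return results


# Stubbed checks: real versions would inspect the running app.
def console_is_clean() -> Tuple[bool, str]:
    errors = []  # assumption: would be filled from the browser console
    return (len(errors) == 0, f'{len(errors)} console errors')


def layout_is_stable() -> Tuple[bool, str]:
    layout_shift = 0.25  # assumption: a measured layout shift score
    budget = 0.1         # assumption: a budget chosen by the team
    detail = f'layout shift {layout_shift} vs budget {budget}'
    return (layout_shift <= budget, detail)


FRONTEND_CHECKS = [
    Check('console has no errors', console_is_clean),
    Check('no layout shift', layout_is_stable),
]

if __name__ == '__main__':
    for name, passed, detail in run_checks(FRONTEND_CHECKS):
        status = 'PASS' if passed else 'FAIL'
        print(f'{status}: {name} ({detail})')
```

Writing each step as a named check with an explicit pass condition is one way to turn a habit like "click around and look for layout shift" into something that can be measured against a number.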
\n\nEvery domain has its own version. For each of those steps, there's likely a tool Claude can use for verification:\n\n![](https://pbs.twimg.com/media/HJ4Z1TJXAAE_fjG.jpg)\n\n## Encode your process as a skill\n\nOnce the process is clear, encode as much of it as possible as a skill. Install the `skill-creator` plugin, then ask Claude to interview you.\n\nIf you're struggling to put your process into words, ask Claude for the domain best practices first and let it show you what an end-to-end verification flow might look like.\n\nTaste and judgment are difficult to codify, but many checks have criteria Claude can measure against: a performance budget, an accessibility checklist, design system rules, good vs bad examples.\n\nFor example, a frontend skill might include instructions to capture a performance trace through the [Chrome DevTools MCP](https://github.com/ChromeDevTools/chrome-devtools-mcp/) or [Agent browser](https://agent-browser.dev/react).\n\nOther checks are more qualitative than pass/fail, like comparing data against historical norms. For these, you can work with Claude to set a rubric for evaluating output.\n\n## Review the code before merging with a second agent\n\nEverything above happens inside the agentic loop. There's a second verification step, the moment before you merge, where you can ask another agent to review.\n\nA new agent won't carry the same biases as the one that wrote the code. It has its own context and isn't influenced by the previous conversation. 
This isolation makes the review more honest and catches things the first agent might have missed.\n\nA few options, from manual to automated:\n\n- /review (built-in skill) - a quick single-pass read of a PR in your terminal.\n\n- [/code-review](https://claude.com/plugins/code-review) (installable plugin) - spins up several subagents in parallel, each reading the diff from a different angle, scores findings for confidence, and posts the result on the PR.\n\n- [Claude Code Review](https://code.claude.com/docs/en/code-review) - a managed service that runs automatically on every PR through GitHub, for Team and Enterprise plans.\n\nWhichever you pick, it's helpful to have a last line of defense before merging to production.\n\n## Putting it together\n\nYou now have two layers:\n\n- verification that runs while Claude is building\n\n- a review before merge from an agent that didn't write the code\n\nBoth belong to the same development lifecycle. Think about your current manual steps: you make a change, clean it up, confirm it works, commit, open a PR, get it reviewed, and watch CI.\n\nYou can roll all those steps into one workflow by writing a skill that calls other skills. For example, the Claude Code team has a skill they run when working on features. It bundles:\n\n1. The `/simplify` skill to clean up the diff\n\n2. A custom `/verify` skill to confirm the change works end-to-end\n\n3. A design check if the diff touched UI\n\n4. A step to open and subscribe to a PR\n\n5. A skill to watch CI and fix failures as they come in\n\nWhile your workflow may look different, creating feedback loops and bundling skills allows Claude to verify and execute more work end-to-end."}}