{"id":"2063645563241844823","url":"https://x.com/IBuzovskyi/status/2063645563241844823","text":"","author":{"name":"YanXbt","username":"IBuzovskyi","avatarUrl":"https://pbs.twimg.com/profile_images/2055755733002506240/uFECjFhT_200x200.jpg"},"createdAt":"Sun Jun 07 15:33:37 +0000 2026","engagement":{"replies":11,"retweets":39,"likes":277,"views":167685},"article":{"title":"Hermes Agent as a Personal AI Operating System","previewText":"Most current AI agent frameworks operate primarily as applications built on top of large language models. They can perform reasoning, call tools, and maintain context within a session, but they","coverImageUrl":"https://pbs.twimg.com/media/HKN9T-XWQAAg5lt.jpg","content":"Most current AI agent frameworks operate primarily as applications built on top of large language models. They can perform reasoning, call tools, and maintain context within a session, but they generally lack robust, native mechanisms for long-term structured persistence, workload isolation, autonomous expansion of their own capabilities, and reliable coordination across multiple components over extended periods of time.\n\nHermes Agent, developed by Nous Research, implements several architectural features that set it apart from many other agent frameworks. These include support for persistent memory across sessions, the ability to run multiple isolated execution contexts through profiles, a structured task orchestration system based on Kanban, mechanisms that allow agents to create and store reusable procedures derived from their own activity, and a messaging gateway that connects the agent to 27+ communication platforms.\n\nThis article examines Hermes through the lens of a Personal AI Operating System. The goal is to provide a detailed and honest analysis of its core architectural layers, how these layers interact in practice, and what the system can realistically offer as of June 2026, based on publicly available documentation and observed behavior.\n\n## 1. 
Core Layers of Hermes\n\nTo better understand the structure of Hermes, it is helpful to map its components to concepts from traditional operating systems.\n\n![](https://pbs.twimg.com/media/HKOCpBNWMAATjfB.jpg)\n\n## 1.1 Memory Architecture\n\nHermes maintains multiple distinct memory layers instead of attempting to keep all relevant information inside a single context window. The main types include:\n\n- Session Memory: Context that is active during a specific task or conversation. This type of memory is typically short-lived and tied to the current session.\n\n- Long-term Memory: Persistent storage of facts, insights, user preferences, and accumulated knowledge that survives across sessions and system restarts. Capped by configurable limits to prevent unbounded growth:\n\n- Skill Memory: Storage of structured, reusable procedures (skills) that the agent has created or refined based on past successful work. Stored as plain markdown files in ~/.hermes/skills/.\n\n- Session Recall: FTS5 full-text search with LLM summarization across the entire conversation history. Query any past session:\n\n> Remind me of every business idea we discussed last month.\nWhat was the competitor analysis we ran 3 weeks ago?\n\nThe multi-layered memory approach is one of the foundational elements that allows Hermes to function more like a persistent system than a typical conversational agent.\n\nExternal Memory Providers:\n\nFor use cases that require deeper intelligence beyond built-in memory, Hermes supports 8 external memory provider plugins:\n\n- Mem0 — knowledge graph + semantic retrieval. Loads only relevant entries per turn. 72% fewer tokens vs naive full injection.\n\n- Honcho — two-peer dialectic memory. Builds separate USER + AI observations. 
Self-host for PII-sensitive environments.\n\n- Hindsight, Holographic, RetainDB, ByteRover, Supermemory, OpenViking — additional providers with different architectures.\n\n## 1.2 Profiles as Isolated Execution Environments\n\nProfiles in Hermes allow users to create and run multiple separate instances of the agent on the same machine. Each profile maintains its own:\n\n- Configuration and model selection\n\n- Memory stores (both session and long-term)\n\n- Set of installed skills\n\n- Gateway connections and associated credentials\n\n- Session history\n\n- Telegram bot token\n\n- Cron jobs\n\n- State database\n\nEach profile becomes its own command:\n\nExample profile configurations:\n\nProfile Distribution:\n\nProfiles can be shared via git. A research agent that works can be distributed to anyone:\n\nAnyone can install it:\n\nThey fill in their own API keys. Skills, soul.md, and workflows transfer. Memories and sessions stay per-machine.\n\nProfile isolation is functional and useful for many real-world scenarios. However, it should not be understood as offering the same security or robustness guarantees as process isolation in traditional operating systems.\n\n## 1.3 Kanban as Orchestration and State Management\n\nThe Kanban system serves as the primary coordination and state management layer in Hermes. It is responsible for several important functions:\n\n- Creating and tracking tasks\n\n- Managing dependencies between tasks\n\n- Handling state transitions\n\n- Facilitating context transfer when one task or profile hands work off to another\n\n- Recording execution history and outcomes for each task attempt\n\nStatuses: Triage → To-Do → Ready → Running → Blocked → Done → Archived\n\nThe dispatcher runs every 60 seconds, auto-assigns tasks to available workers, tracks heartbeats, detects zombie processes, and manages retry budgets.\n\nMorning workflow example:\n\nOne particularly important feature is the \"Blocked\" state. 
When a task enters this state, execution pauses until a human provides input or unblocks it. This design makes human oversight a structured and native part of the workflow, rather than an external or ad-hoc intervention.\n\nBy treating tasks as first-class objects with preserved context and history, the Kanban layer helps reduce the information loss that commonly occurs during handoffs in multi-agent or multi-step workflows.\n\n## 1.4 Cron Jobs — The Scheduler\n\nCron jobs are time-based autonomous tasks written in plain English. No crontab syntax required.\n\nThis is the layer that transforms Hermes from a reactive tool into a proactive system. Useful information arrives before you ask for it.\n\nExamples of production cron jobs:\n\nCron jobs can target specific Telegram topics, specific profiles, and specific delivery platforms (Telegram, Discord, Slack, email).\n\nThe Web Dashboard provides a full cron management UI: create, edit, pause, resume, trigger manually, view last run time and next run time.\n\nIn OS terms, cron jobs are the scheduler daemon. They ensure the system does work on a predictable cadence without human initiation.\n\n## 1.5 /goal — Persistent Objectives (The Ralph Loop)\n\nA normal prompt asks Hermes for one response. /goal gives Hermes an objective to work toward across multiple turns until a judge model determines the goal is achieved.\n\nThe architecture:\n\n- Agent executes one turn toward the goal\n\n- Judge model evaluates: done or continue?\n\n- If continue: agent runs another turn\n\n- If done: goal completes, result delivered\n\n- Default max_turns: 20. 
Configurable per task type.\n\n- /goal resume resets the turn counter and continues\n\nThe structured /goal template:\n\nExample:\n\nThe interview hack — let Hermes write its own /goal:\n\nEvery /goal also becomes a Kanban card automatically, making progress visible on the board.\n\nCore commands:\n\n## 1.6 Skill Creation Mechanisms\n\nHermes includes functionality that allows agents to create and store reusable procedures (skills) based on their own activity. When an agent successfully completes certain types of work, it can identify patterns, formalize them, and save them for future use.\n\nSkills are stored as plain markdown files in ~/.hermes/skills/. They are transparent, readable, and editable. No black box.\n\nExample — a content creation skill:\n\nView all skills:\n\nHermes ships with 60+ built-in tools across terminal, web, browser, vision, image generation, TTS, and code execution. Skills layer on top of those tools to create full workflows.\n\nIn v0.16.0, the default skill set was trimmed to what you actually need — leaner out of the box, less noise. NVIDIA skills joined the trusted Skills Hub taps, bringing official CUDA-X, Omniverse, NeMo, and TensorRT-LLM skills into the catalog.\n\nThe compounding effect:\n\nAgents with 20+ self-created skills finish similar future tasks approximately 40% faster than fresh instances (per Nous Research observations). This compounding is the core differentiator of Hermes.\n\nIn practice, the maturity, reliability, and degree of autonomy of skill creation vary significantly. In many cases, especially during early usage or with complex tasks, human review and curation of created skills remain important for achieving high-quality results.\n\n## 1.7 Autonomous Curator — The Garbage Collector\n\nAs skills accumulate over weeks and months of usage, redundancy, outdated procedures, and bloat become real concerns. 
The Autonomous Curator addresses this.\n\nThe Curator is a background process that runs on a configurable schedule (default: 7-day cycle). It:\n\n- Identifies redundant or overlapping skills\n\n- Prunes skills that are no longer relevant\n\n- Compresses and consolidates related procedures\n\n- Optimizes the skill library for retrieval efficiency\n\n- Revises skill descriptions for better searchability\n\nIn OS terms, the Curator functions as a garbage collector and defragmenter. It prevents the skill filesystem from degrading over time.\n\nThis is particularly important because Tool Search (covered below) relies on skill names and descriptions for retrieval. Poorly maintained descriptions degrade search accuracy.\n\nFrom the NVIDIA NemoTron Labs live stream, Karan from Nous Research confirmed: \"The Hermes Curator is an autonomous background feature that manages, cleans, optimizes, revises, improves, and compresses your skill library all the time.\"\n\n## 1.8 Tool Search — Dynamic Linker\n\nWhen you connect 15+ MCP servers, their tool schemas consume context window space on every turn — even when most tools are irrelevant to the current task.\n\nTool Search replaces all MCP/plugin schemas with 3 lightweight bridge tools:\n\n- tool_search — finds the right tool by name and description (BM25 retrieval)\n\n- tool_describe — loads its full schema on demand\n\n- tool_call — executes it\n\nEach bridge tool costs approximately 300 tokens vs thousands for the full schema array.\n\nThree modes: auto (recommended), on (always active), off (disabled).\n\nAccuracy on Opus 4 went from 49% to 74% with Tool Search enabled (Anthropic's own tests).\n\nCore Hermes tools (terminal, memory, browser, web search) are never deferred. They stay loaded on every turn.\n\nIn OS terms, Tool Search functions as a dynamic linker. Instead of loading every shared library at startup, the system loads them on demand when the running process needs them. 
This preserves memory (context window) for actual work.\n\n## 1.9 Gateway — The Network Stack\n\nThe Gateway is the layer that makes Hermes accessible from anywhere. One gateway process connects the agent to 27+ messaging platforms simultaneously:\n\nTelegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Matrix, Mattermost, Microsoft Teams, Teams Meetings, Google Chat, LINE, DingTalk, Feishu/Lark, WeCom, WeChat, QQ, Yuanbao, BlueBubbles (iMessage), SimpleX, ntfy, Open WebUI, Home Assistant, MS Graph Webhooks, and more.\n\nThe gateway runs as a single process. Approval buttons are native in Telegram and Slack — the agent can request human confirmation before executing sensitive actions.\n\nSSEP — Structured Stream-Event Protocol (v0.16.0+):\n\nThe agent no longer streams raw text and hopes platforms can render it. Instead:\n\n1. Agent emits typed events only: MessageChunk, MessageStop, ToolCallChunk, ToolCallFinished, Commentary, LongToolHint, GatewayNotice\n\n1. Gateway router routes each event to the right platform adapter\n\n1. Each adapter renders what it can and silently drops what it can't\n\nTelegram gets animated drafts in MarkdownV2. iMessage drops tool-chrome the user doesn't need to see. Each event is immutable. Ordering is preserved per stream.\n\nIn OS terms, the Gateway is the network stack and SSEP is the display server / compositor. The agent produces a universal output format; the rendering layer adapts it per display.\n\nRemote access:\n\nThe Desktop App can connect to a Hermes backend running on another machine (VPS, home server, behind Tailscale):\n\nOne agent running on a VPS. Managed from Desktop on your laptop, CLI via SSH, and Telegram on your phone. 
All hitting the same memory, skills, and sessions.\n\n## 1.10 Voice Mode — I/O Layer\n\nVoice mode provides speech input and output across CLI and all messaging platforms.\n\nFive speech-to-text providers:\n\n- Local faster-whisper (free, runs on device)\n\n- Groq\n\n- OpenAI Whisper\n\n- Mistral Voxtral\n\n- xAI Grok STT\n\nFive text-to-speech providers:\n\n- Edge TTS (free, default)\n\n- ElevenLabs\n\n- OpenAI\n\n- NeuTTS (local, free)\n\n- MiniMax\n\nWorks in Telegram voice messages, Discord voice channels (live voice conversations with the agent), WhatsApp, Signal, Slack, and CLI.\n\nIn OS terms, voice mode is the I/O layer — providing alternative input/output methods beyond text.\n\n## 1.11 Security Layer\n\nHermes provides multiple security primitives for production deployments:\n\nLayer 1 — Bitwarden Secrets Manager (Credential Management)\n\nOne bootstrap token in .env. All real credentials live in Bitwarden. Every Hermes instance pulls secrets at startup. Rotate a key once in the web app — every instance picks it up on next restart. Free tier.\n\nLayer 2 — iron-proxy Egress Firewall (Credential Protection)\n\nInstead of injecting real credentials into the sandbox, Hermes gives the agent opaque proxy tokens. iron-proxy intercepts at the network boundary, swaps for the real credential, forwards the request. The sandbox never holds the actual key.\n\nLayer 3 — Promptware Defense\n\nProtection against Brainworm-class prompt injection attacks. The agent detects and rejects attempts to override its instructions through malicious content in processed documents, web pages, or tool outputs.\n\nv0.16.0 added: CVE-2026-48710 Starlette pin, SSRF off-loop hardening, and subprocess credential stripping. 16 security-tagged issues closed in this release alone.\n\nLayer 4 — OpenShell (Enterprise, via NVIDIA partnership)\n\nFor enterprise deployments, Hermes integrates with NVIDIA OpenShell and Microsoft security primitives. 
OpenShell provides:\n\n- Per-user policy gates controlling what the agent can access\n\n- Token masking at egress (agent never sees real credentials)\n\n- Hot-swappable policies without restart\n\n- Admin observability and audit trails\n\nFrom the NVIDIA NemoTron Labs live stream, Karan from Nous Research: \"The ability for me to say, as smart as you might get, there's no way you're getting through this particular gateway, there's no way I'm going to allow you to use the skill that you made because I'm not supervising you in the particular manner that I want to.\"\n\n## 1.12 Extensibility — Skills Hub and MCP Catalog\n\nSkills Hub (agentskills.io): Community-contributed skills. Browse, search, install directly from the hub through the dashboard or CLI.\n\nMCP Catalog: Curated by Nous Research. Every entry via merged PR. 19,932 skills in the catalog.\n\nNVIDIA Skills: Official NVIDIA agent skills integrated into the Skills Hub. CUDA-X libraries, Omniverse workflows, NeMo training and inference, TensorRT-LLM optimization, CUDA-Q quantum programming. Mirrored daily from NVIDIA product repos.\n\nIn OS terms, the Skills Hub and MCP Catalog function as a package manager. Users can discover, install, and manage capabilities without building them from scratch.\n\n## 1.13 Interface Layer\n\nHermes can be accessed and managed through multiple surfaces:\n\nCLI (Command Line Interface): Full feature parity. Every command, every tool, every configuration option available. The most powerful interface.\n\nTUI (Text User Interface): Rich terminal interface with panels and navigation. Middle ground between CLI power and visual feedback.\n\nDesktop App (v0.16.0 — \"The Surface Release\"): Native Electron app for macOS, Windows, and Linux. Built across 100 PRs and 159 commits in a single week. 
First demoed at Jensen's GTC keynote.\n\n- Side-by-side preview pane\n\n- Built-in file browser\n\n- Drag-and-drop files directly into chat\n\n- Integrated voice mode\n\n- Inline model picker in the status bar (fuzzy-searchable)\n\n- Concurrent multi-profile sessions\n\n- Settings UI for models, API keys, tools\n\n- Profile management\n\n- Artifacts viewer (every file Hermes creates)\n\n- In-app self-update\n\n- Full Simplified Chinese translation\n\n- Same HERMES_HOME directory as CLI — sessions transfer seamlessly\n\nDownload: hermes-agent.nousresearch.com/desktop\n\nIf Hermes is already installed:\n\nWeb Dashboard:\n\n- Models, cron jobs, skills, profiles, kanban board\n\n- Full browser-based admin panel: MCP catalog, messaging channels, credentials, webhooks, memory management\n\n- Pluggable authentication: OIDC or username/password login\n\n- Fully extensible with themes (YAML) and plugins (JS + Python)\n\n- No data leaves localhost by default\n\nMessaging Platforms: 27+ platforms through the gateway (covered in section 1.9).\n\n## 2. The Compounding Effect\n\nThe compounding nature of Hermes is its most distinctive property and the primary reason it functions more like an operating system than a typical agent.\n\nDay 1: Hermes knows nothing about you. Every task requires full instructions. You explain your workflow, your preferences, your tools. The agent is a blank slate.\n\nWeek 2: Hermes has accumulated memory about your projects, preferences, and working style. It stops asking questions you've already answered. Tasks that required 10 messages now require 3.\n\nMonth 1: Hermes has created 15-20 skills from completed work. Your content workflow, your research process, your inbox triage method — each encoded as a reusable procedure. Tasks that took the agent 20 turns on day 1 now complete in 5.\n\nMonth 3: With 40+ skills and deep memory, the agent operates at a level that cannot be replicated by switching to a better model with a blank context. 
The accumulated skills, memory, and learned preferences create a compounding advantage that grows with every session.\n\nThe math: Agents with 20+ self-created skills finish similar future tasks approximately 40% faster than fresh instances. This improvement compounds — each completed task potentially creates or refines a skill that accelerates future work.\n\nWhat this means in practice:\n\nFrom the NVIDIA NemoTron Labs live stream, Johnny from Nous Research described his actual workflow: \"Every morning I initiate a planning session. For every planning session I get a date-key file with things I want to do. The skill looks back for the week and tells me what I've been lacking on or if there's something I said was urgent and I haven't gotten to. At 11pm a cron fires and tells me: did you do what you wanted to do.\"\n\nThis is a system that evolved through use. The morning planning skill, the date-key filing system, the weekly retrospective — none of these were pre-built. They emerged from Johnny's usage patterns and became permanent infrastructure.\n\nKaran, who trained the first Hermes models, uses it for ML ablations: \"I really hate doing ablations. It's tedious, time consuming. But it needs to be done. That's how you do science. Hermes does it now. And I don't have to do it.\"\n\nThe compounding effect is the core argument for treating Hermes as infrastructure rather than as an application. Applications provide the same value on day 90 as day 1. Infrastructure improves with investment.\n\n## 3. Token Economics — What It Actually Costs\n\nRunning Hermes as a personal OS has concrete costs. Understanding them is important for sustainable use.\n\nThe agent runtime: Hermes itself is free and open source (MIT license). The cost comes from model inference and infrastructure.\n\nInfrastructure options:\n\n![](https://pbs.twimg.com/media/HKOE0PlW0AInCtv.png)\n\nMinimum VPS specs: 2 vCPU, 2GB RAM for light use.\n\nRecommended: 4 vCPU, 8GB RAM for heavy use. 
No GPU needed — Hermes calls APIs, not the model directly.\n\nModel provider options:\n\n![](https://pbs.twimg.com/media/HKOE3xoWQAA1NKm.png)\n\nX API costs (pay-per-use since February 2026):\n\n![](https://pbs.twimg.com/media/HKOE7c8WgAARw2t.png)\n\nAlternative: OpenTweet MCP at $5.99/month flat.\n\nRealistic monthly budgets:\n\nThe token estimates below are approximations based on typical session patterns. Actual consumption depends on model, task complexity, tool output volume, and configuration. Use /usage inside Hermes to measure your real numbers.\n\nRunning the full content system described in this article (5 daily cron jobs, 2 content sessions/day with /goal, daily sub-agent research, kanban tracking) consumes approximately 10-11M tokens/month. Here is what that costs depending on your model strategy:\n\n![](https://pbs.twimg.com/media/HKOFBmoWQAAIJd1.png)\n\nThe same system that costs $27/month on GPT-5.5 costs $250/month on Claude Opus. Roughly a 9x difference for the same cron jobs, the same /goals, the same sub-agents.\n\nWhy this matters: Hermes is model-agnostic. You pick the model per profile, per task. Routine cron jobs that scan X for trending posts do not need Opus-level reasoning. A $0 GPT-5.5 call does the same job. Reserve the expensive model for the one /goal per day where writing quality or deep reasoning makes a real difference.\n\nThe cheapest complete path:\n\n![](https://pbs.twimg.com/media/HKOG3jlXgAABA_A.png)\n\nThat is a 24/7 autonomous agent with 5 daily cron jobs, persistent memory, self-improving skills, kanban task tracking, and Telegram access from your phone.\n\nCompare: a virtual assistant doing the same work costs $500-2,000/month. A content agency costs $3,000-8,000/month.\n\nNote on Nous Portal: The Plus tier ($20/month, $22 credits) works well for light usage (1-2 cron jobs, a few sessions per day). 
For the full content system described here, the Super tier ($100/month, $110 credits) or bring-your-own-keys is more realistic.\n\nToken optimization (6 methods to reduce costs):\n\n1. Compact file reader — 14% fewer tokens per file read (automatic in latest version)\n\n1. Prompt caching — ~75% reduction on multi-turn sessions (Anthropic models only)\n\n1. /compress — summarizes session history, drops overhead\n\n1. Tool Search — loads schemas on demand instead of upfront\n\n1. Subagent delegation — each subagent in own context, only summaries return\n\n1. Retrieval-based memory — 72% fewer tokens vs naive full injection\n\nFastest path to a working agent:\n\nOne OAuth covers model + web search + image generation + TTS + cloud browser. No separate API keys needed.\n\n## 4. How The Layers Chain Together\n\nThese layers compound when stacked. Here is one chain running end-to-end:\n\nOne day. Nine architectural layers fired. Two posts shipped. Zero manual research. Total API cost: approximately $2-4.\n\n## 5. Key Characteristics\n\nPersistence\n\nHermes is explicitly designed to retain information across sessions through its memory system. This allows accumulated context and created skills to persist over time, rather than being lost after each session or restart.\n\nIsolation and Coordination\n\nThe combination of Profiles and Kanban allows Hermes to support both isolation and structured collaboration. Profiles provide separation between different workloads, while Kanban enables controlled handoff and context transfer when collaboration is required.\n\nSelf-Improvement Mechanisms\n\nThe presence of skill creation functionality provides Hermes with a pathway for structural self-improvement. Unlike systems that rely solely on prompt engineering or manual tool definitions, Hermes can expand its own capabilities based on usage patterns. 
The Autonomous Curator ensures the skill library stays clean and efficient over time.\n\nHuman Oversight as a Native Feature\n\nHuman intervention is implemented as a first-class concept through the Blocked task state in Kanban and approval buttons in Telegram and Slack. This allows the system to pause execution cleanly, preserve context, and resume intelligently once the required input is provided.\n\n## 6. Practical Considerations\n\nWhen using Hermes as infrastructure rather than as a simple conversational tool, several practical factors become important:\n\n- The long-term value of the system depends heavily on how memory and created skills are managed, curated, and maintained over time. The Autonomous Curator helps, but periodic human review improves quality.\n\n- Profile isolation is useful but requires deliberate configuration. It is not automatic and does not provide the same guarantees as traditional process isolation.\n\n- The quality and usefulness of autonomously created skills can vary significantly. In many cases, especially early on, human review improves outcomes.\n\n- Resource consumption, particularly model context windows and inference costs, should be actively monitored. Use /usage and /compress regularly. Enable Tool Search for heavy MCP setups.\n\n- The effectiveness of the overall system is highly dependent on thoughtful configuration and ongoing management, rather than emerging automatically from simply running the software.\n\n- Token economics should be understood before committing to heavy usage patterns. Start with Nous Portal Plus at $20/month and scale from there.\n\nToken-Aware Configuration\n\nRunning Hermes as a full OS with multiple profiles and cron jobs consumes tokens on every session startup (system prompt + memory + skills index). Without optimization, costs can grow faster than expected.\n\nUse the right model for the right job:\n\nNot every task needs the strongest model. 
Matching model to task type is the single biggest cost lever.\n\nUse frontier models (Opus, GPT-5.5) for complex /goals. Use cheaper models for daily cron jobs and routine triage. One switch cuts your monthly bill in half.\n\nLower memory limits for lightweight profiles:\n\nDefault memory injection is 2,200 chars (~800 tokens) per turn. In a 50-turn /goal session, that is 40K tokens spent repeating memory. For profiles that don't need deep personal context:\n\nSet realistic max_turns:\n\n50 turns on Opus can cost $5-12 per session. Set max_turns per profile, not globally. Research profiles rarely need more than 20.\n\nEnable all 6 token optimizations:\n\nPlus: prompt caching (automatic on Anthropic), /compress for long sessions, subagent delegation for parallel work.\n\nUse cheap auxiliary models for side-jobs:\n\nHermes offloads compression, vision, web summarization, approval scoring, tool routing, and session titles to auxiliary models. Each slot is configurable independently. Use a cheap fast model for these while keeping your expensive model for main work:\n\nThis means /compress and auto-compression run on cheap tokens, not on your main model's pricing.\n\nTune the compression threshold:\n\nLower this to 0.30-0.40 for more aggressive compression. Sessions stay lighter, fewer tokens accumulate before the compressor fires.\n\nLossless Context Management (LCM):\n\nThe default compressor is lossy — it summarizes and drops older context. LCM is a plugin alternative that preserves all context without loss while still optimizing token usage. Available via hermes plugins → Context Engine.\n\nMonitor with /usage:\n\nRun this regularly. Compare token counts across sessions. If a cron job burns more tokens than expected, simplify its prompt or switch it to a cheaper model.\n\nCost scaling by setup complexity:\n\nThese are estimated ranges. 
Run /usage in Hermes to compare against your actual numbers.\n\n![](https://pbs.twimg.com/media/HKOIDN3XMAEjtY3.png)\n\nThe cheapest path: run everything through GPT-5.5 via Codex ($20/month ChatGPT subscription, inference included). Reserve Claude Opus for the sessions where reasoning quality makes a measurable difference in your output.\n\n## 7. Current Limitations (as of June 2026)\n\nHermes possesses several meaningful architectural strengths, but it remains an evolving system rather than a fully mature personal operating system:\n\n- The native Desktop App significantly improves accessibility, but it does not yet provide full feature parity with the CLI/TUI for all tool interactions, particularly complex browser automation and certain local integrations.\n\n- Running large numbers of concurrent agents or very long-running workflows can place substantial pressure on model context windows and inference resources. Careful resource management is often required.\n\n- Profile isolation is practical and functional for many use cases, but it does not offer the same level of robustness or fault isolation as process isolation in traditional operating systems.\n\n- Autonomous skill creation is a promising direction, but its maturity and reliability remain variable. High-quality, reusable skills often still require human curation, particularly for complex or high-stakes tasks.\n\n- Auto-compaction during long sessions can cause context loss. The Autonomous Curator and session recall are partial solutions. Keeping the full thread in context for the life of the window prevents silent drift but limits session length.\n\n- Some advanced tool integrations may still be more stable when used through the CLI/TUI rather than through the Desktop App or messaging interfaces.\n\n- The SSEP gateway protocol is new (v0.16.0). 
Edge cases in per-platform rendering may exist for less common messaging platforms.\n\nThese limitations are primarily related to implementation maturity rather than fundamental architectural shortcomings. The project continues to develop actively. The v0.16.0 \"Surface Release\" alone included 874 commits, 542 merged PRs, and contributions from 170 community members. The prior v0.15.0 \"Velocity Release\" included 1,302 commits, 747 merged PRs, and 321 contributors.\n\n## 8. How Hermes Compares to Other Agent Frameworks\n\nThe most common question when evaluating Hermes: how does it compare to Claude Code, OpenClaw, and CrewAI? The answer is that they solve different problems and are built on different philosophies.\n\n![](https://pbs.twimg.com/media/HKOIeOBWgAANIGG.jpg)\n\n![](https://pbs.twimg.com/media/HKOImkBXoAA3DVk.png)\n\nThe mental model that works (from builders who use all three):\n\nClaude Code is your daily driver at your desk. Best raw coding agent available. If the job is \"write code, refactor code, debug code, understand this codebase,\" Claude Code wins.\n\nHermes Agent is your 24/7 infrastructure. It runs while you sleep, manages multiple workloads through profiles, compounds through skills and memory, and reaches you on Telegram from anywhere.\n\nOpenClaw is your chat-first assistant. Largest marketplace, easiest managed hosting ($3/month), strongest non-technical user experience.\n\nCrewAI is your orchestration framework. When you need multiple specialized agents working together on a defined pipeline in Python. Not a standalone agent — a framework for building multi-agent systems.\n\nOne benchmark that illustrates the difference:\n\nAn independent test ran the same 18 prompts through Claude Code (Opus 4.7), OpenClaw (Sonnet 4.6), and Hermes Agent. Hermes won 14 of 18. The 4 it lost were raw coding tasks where Claude Code's codebase understanding is unmatched. 
The 14 it won were tasks where memory and context from previous sessions made the difference.\n\nThe takeaway: Hermes wins when history matters. Claude Code wins when code depth matters. They are complementary, not competing.\n\nHermes ships hermes claw migrate — a built-in migration command from OpenClaw. When a product ships a named migration command for a specific competitor, the positioning is clear.\n\n## 9. Start Here\n\nIf you read this entire article and want to start, here are three paths based on your situation.\n\nPath 1 — I have 15 minutes (fastest to first result):\n\nPath 2 — I have an evening (full personal setup):\n\n1. Install Hermes and run hermes setup --portal\n\n1. Connect Telegram (BotFather → token → paste)\n\n1. Create your first profile: hermes profile create work\n\n1. Write a soul.md that defines how the agent should behave\n\n1. Set 3 cron jobs (morning briefing, competitor check, daily review)\n\n1. Run your first /goal with the structured template:\n\n7. Open the dashboard: hermes dashboard\n\n8. Review skills after a week. Delete weak ones. Refine strong ones.\n\nPath 3 — I want the full OS (weekend project):\n\n1. Spin up a Hetzner CX22 VPS (~$7/month)\n\n1. Install Hermes on the VPS via SSH\n\n1. Run hermes setup --portal\n\n1. Connect Telegram gateway: hermes gateway start\n\n1. Create 3-4 profiles (content, research, ops, code)\n\n1. Write soul.md for each profile\n\n1. Set up cron jobs per profile\n\n1. Configure Kanban for cross-profile task tracking\n\n1. Install the Desktop app on your laptop\n\n1. Connect Desktop to the remote backend via auth gate\n\n1. Enable Tool Search in config.yaml\n\n1. Lower memory char limits for token optimization\n\n1. Set up Bitwarden Secrets Manager for credentials\n\n1. Run for one week. Review skills, memory, and token usage.\n\n1. Iterate. The system compounds from here.\n\nPriority order if overwhelmed: Start with cron jobs (#3 in the 10-hack article), /goal structure (#4), and skills (#8). 
These three setups change how Hermes feels overnight.\n\n## Conclusion\n\nHermes Agent represents one of the more architecturally ambitious attempts among current open-source agent frameworks to move beyond simple conversational or tool-calling interfaces. Its combination of persistent memory, profile-based isolation, structured task orchestration through Kanban, plain-English cron scheduling, persistent /goal objectives, dynamic tool loading, multi-platform gateway access, voice interaction, production security primitives, and mechanisms for creating reusable procedures gives it characteristics that align more closely with the concept of a personal operating system than most other systems available today.\n\nKaran from Nous Research, who trained the first Hermes models, described it simply: \"Hermes Agent is the ability to take a language model and realize that everything that happens on your computer is text in or text out. Hermes Agent lets you do that with all the integrations on your computer. It can use your browser, your apps, everything you do on the computer. It's a general automator, general simulator of computer actions and digital actions.\"\n\nAt the same time, it is important to maintain realistic expectations. Hermes is not yet a fully mature personal AI operating system. Its architectural direction is promising, but real-world effectiveness still depends heavily on careful configuration, ongoing management, and an honest assessment of feature maturity.\n\nWhen used thoughtfully as infrastructure, Hermes can serve as a foundation for building long-term, evolving AI-assisted workflows that compound in capability over time. The meaningful difference lies in how deliberately the system's capabilities and limitations are understood and utilized.\n\nThe agent is ready. The stack is ready. 
The value compounds with use.\n\n## Related Articles\n\n- [HERMES AGENT: THE COMPLETE GUIDE](https://x.com/IBuzovskyi/status/2059675518966894767?s=20) — installation, models, dashboard, use cases, security\n\n- [The Complete Hermes /goal Playbook — 21 Workflows](https://x.com/IBuzovskyi/status/2059303967767593247)\n\n- [Hermes /goal — The Full Guide](https://x.com/IBuzovskyi/status/2056764150936748082)\n\n- [How to Make Hermes + xurl Actually Work as a System](https://x.com/IBuzovskyi/status/2057114309616885997)\n\n- [Hermes x Bitwarden — The Security Stack](https://x.com/IBuzovskyi/status/2057914816015249515)\n\n- [10 Hermes Agent Setups](https://x.com/IBuzovskyi/status/2062101068842975409?s=20)\n\nExpanded versions and additional Hermes content on Substack: [https://substack.com/@yanxbt](https://substack.com/@yanxbt)\n\nThis article is based on publicly available Hermes Agent documentation (v0.16.0 \"The Surface Release\"), the NVIDIA NemoTron Labs live stream, and observed system behavior as of June 2026.\n\n@NousResearch @Teknium"}}