{"id":"2067345406699393176","url":"https://x.com/BrainsAndTennis/status/2067345406699393176","text":"","author":{"name":"Peter Wang","username":"BrainsAndTennis","avatarUrl":"https://pbs.twimg.com/profile_images/1550254914236022785/0bvF2WY-_200x200.jpg"},"createdAt":"Wed Jun 17 20:35:29 +0000 2026","engagement":{"replies":9,"retweets":15,"likes":197,"views":52946},"article":{"title":"Filesystem-Pilling Your Vertical Agent","previewText":"\"Filesystem and bash is all you need\" has been the de facto way to build agents for at least six months now. Longer, if you date it to the mainstream arrival of Manus. And yet, concrete case studies","coverImageUrl":"https://pbs.twimg.com/media/HLCq7HEacAAMwQS.jpg","content":"\"Filesystem and bash is all you need\" has been the de facto way to build agents for at least six months now. Longer, if you date it to the mainstream arrival of [Manus](https://manus.im). And yet, concrete case studies of what filesystem and bash actually buys you on a real vertical are few and far between. So I'll go through how we enabled [Shortcut](https://shortcut.ai/) for one use case: data enrichment. \n\n![](https://pbs.twimg.com/media/HLCoKU4aUAAX5-b.jpg)\n\nYou have a table in which the rows are entities (companies, candidates, leads, etc.) and a set of columns you wish were filled in: a company's headcount and funding stage, a candidate's degree and notable work, an investor's recent deals. Enrichment is going out to the web and external sources, finding those facts, recording where each one came from, and writing them back into the table. \n\nPeople pay a lot for this. Sales teams enrich leads with data to decide who to contact and what to open with, recruiters enrich candidate lists, investors enrich deal pipelines and run diligence, and so on. 
[Clay](https://www.clay.com), essentially spreadsheet-driven enrichment, raised $100M at a $3.1B valuation in August 2025.\n\nThe TLDR: Shortcut solves enrichment not by designing a purpose-built enrichment tool but by filesystem-pilling the tools it already had. \n\n## The basics of being filesystem-pilled\n\nLet's go through the basics. Filesystem-pilling means giving the agent a real filesystem and shell and routing its work through files on disk instead of through the context window. Data lands in files, the agent reads and writes them with bash, and what flows through the transcript is mostly pointers to that data rather than the data itself. The immediate reason to do it is that it saves tokens on both receiving inputs and generating outputs. Three examples:\n\nBash\n\nBash is powerful precisely because the agent can run anything, but that means you can't control the output either. The same command might print 5 lines or 5,000. So don't try to bound it; let it run and catch the overflow. When output is too big for context, it doesn't get pasted back into the transcript; it streams to disk, and the model gets a truncated head plus a path. [pi](https://github.com/earendil-works/pi)'s [bash tool](https://github.com/earendil-works/pi/blob/main/packages/coding-agent/src/core/tools/bash.ts) is a clean reference: it caps output at 50 KB / 2000 lines and appends the full path when it spills.\n\nMCP\n\nMost MCP tool outputs are abominations: 2 MB of deeply nested JSON. So don't let it into context. Persist it to disk and use a JSON parser to hand back a path plus the shape of the data, not its values. 
From the shape and a couple of example records, the model can then go straight after the values it actually wants: jq the two fields it needs, filter to the 12 rows that match, and count the rest, instead of reading 4182 contacts to use a handful.\n\nSubagents\n\nSay you have a highly parallelizable task made for subagent delegation, like researching 200 companies to find each one's latest funding round, employee count, and primary product. The naive move is to type a full prose brief into every subagent call, reinflating nearly the same paragraph 200 times in the parent's output. The fix is one small change to the subagent tool: let it accept its query as a .txt file, not just an inline string. Once the query is a file, the agent can compose all 200 programmatically with bash instead of writing them out by hand.\n\nThe same move runs through all three: keep the bulk out of the model and let files carry it instead. It's faster and it's cheaper, and the tokens you don't spend on bulk are tokens left for reasoning: a leaner context is a smarter agent.\n\n## The case study: data enrichment\n\nThe basics above help any agent. But does filesystem-pilling still pay off on a bespoke vertical, like data enrichment?\n\nThe obvious solution\n\nMechanically, data enrichment is always the same loop: start with a table, fill the missing columns from external research, attach sources, normalize the answers, export.\n\nThe obvious solution mirrors the grid: put an agent in every cell. [Paradigm](https://www.paradigmai.com) builds an AI-native spreadsheet where each cell holds an agent that searches the web and fills its value; [Clay](https://www.clay.com) enriches per row; [Quadratic](https://www.quadratichq.com) and [Freckle](https://www.freckle.io), among others, each ship a variant of the same idea. The execution model is the UI:\n\nIt demos beautifully - cells lighting up one by one - and it's a clean product abstraction. It is also a tool that solves exactly one shape of problem. 
Take a real table:\n\n![](https://pbs.twimg.com/media/HLCm_tubEAA1S6R.jpg)\n\nEvery column on this table breaks the cell model, and each breakage is its own kind:\n\n- The labels aren't flat. Comp → Base and Comp → Equity are one logical thing split by a nested header. A tool keyed on a single (row, column) label has nowhere to put that hierarchy, and the two cells get researched as if they were unrelated.\n\n- The label isn't the query. \"Notable projects of B. Okoro\" is a terrible search query; \"open-source projects and conference talks by B. Okoro, the candidate who led a payments rewrite\" is a good one. But composing that query needs the other cells in the row, which the cell agent never sees.\n\n- Independent answers don't standardize. Look down any column: highest degree appears in four formats for one fact. Each cell agent answered in isolation, so nothing ever saw the column as a whole to make it uniform.\n\nThese aren't separate bugs but one root cause: a cell-agent tool confines the agent's intelligence to \"rows\" × \"columns\" × an exact label format. The research problem isn't cell-shaped, but the tool forces it to be.\n\nThe fix: filesystem-pill a plain web search\n\nThe fix isn't a better enrichment tool. It's not building one. Instead, we filesystem-pilled a plain web search.\n\nShortcut's web_search does one humble thing: it takes a .txt file of queries, one per line, plus one output schema for the whole batch, runs them concurrently, and writes one JSONL row per result. It knows nothing about spreadsheets, rows, or cells. So the agent composes the job in code:\n\nwhich produces a plain file, one self-contained query per line:\n\nthat the agent hands to the tool:\n\nThose few lines fix two of the three failures outright. The three columns collapse into one lookup per person, so each row is a real query, with name, title, and company woven in, instead of a bare cell label. And one schema, applied across the batch, pins the answer shape for every row at once. 
That's a 3x savings in lookups.\n\nThe third failure, standardization, falls to the next step. The results file comes back to the same agent loop that launched the search, which now reads it with the whole column in view: it collapses PhD and Doctorate into one form, coerces a stray \"8 years\" to 8, reissues the queries that came back thin, and iterates until the column is consistent by construction. This is the stage the cell model structurally cannot have, because no cell agent ever sees past its own cell.\n\nAnd because every step left a file behind, the workflow gains two properties: it's rerunnable and auditable. Rerunnable: fixing a later stage reprocesses the saved JSONL instead of re-running the expensive web research. Auditable: every intermediate (the composed queries, the raw results, the snippets, even the sources that were rejected) is a real file the user can read and download.\n\n## The lesson\n\nThe instinct, faced with a vertical like spreadsheet enrichment, is to build a spreadsheet-specific tool: an agent per cell, an enrichment engine, the grid baked into the product. It demos well and it covers the common case.\n\nDon't build the vertical tool. Turn the vertical problem into a general one and let the agent compose the pipeline. The edge cases stop being special. The same set of primitives that handled enrichment handles the next vertical too, because nothing in it was about enrichment. \n\nFind the place you were about to build the bespoke tool and ask what the general version is. More often than you'd expect, the best thing you can build for a vertical is the thing that isn't specific to it at all."}}