Your AI Coding Agent Has Amnesia. Here's the Fix.
Every new session with Claude Code, Cursor, or Windsurf starts from zero. Agent OS solves the memory architecture problem by capturing your team's conventions as injectable context.
Every Session Starts From Zero
Every time you open a new conversation with Claude Code, Cursor, or Windsurf, you start from zero.
Your agent doesn’t know why you chose Postgres over MongoDB three months ago. It doesn’t know that your team decided to ban TypeScript’s any type after a painful incident in production. It doesn’t know that “the auth module” means the one in /services/auth, not the deprecated one still sitting in /legacy. It doesn’t know that error responses always use your custom envelope, not whatever the framework defaults to.
You know all of this. It lives in your head, or in some Confluence page nobody reads, or scattered across a Slack thread from eight months ago.
So every session, you re-teach the agent. Again. And again. And again.
That’s not a prompting problem. That’s a memory architecture problem.
I Gave a Talk About This at Our AI Space
Every two weeks, a group of us gets together to share what we’re learning, experimenting with, or shipping around AI. No hype, no sales pitches — just engineers talking shop. This week I presented Agent OS, a lightweight system for solving exactly this problem. This post is the written version of that talk.
What Is Agent OS?
It’s not a new AI coding assistant. It doesn’t replace Claude Code or Cursor. It’s the layer underneath them — a system for defining, managing, and injecting your coding standards into AI-powered development workflows.
The pitch is simple: instead of rediscovering your team’s conventions every session, Agent OS captures them once as Markdown files and injects them into context when needed.
Version 3 goes further — it can discover those standards automatically by scanning your existing codebase, extracting patterns that would otherwise only exist as tribal knowledge.
The Core Loop
Discover → Inject → Build → Refine
That’s it. Four steps, and they map to four slash commands in Claude Code:
/discover-standards — Scans your codebase and surfaces patterns worth documenting. Not obvious stuff like “we use React” — the weird, opinionated, tribal stuff. The kind of thing a new developer would get wrong on their first PR. It walks you through confirming, editing, or skipping each pattern, then writes Markdown standards files.
/inject-standards — Pulls relevant standards into your current context. It can auto-suggest based on what you’re working on, or you can be explicit: /inject-standards api/response-format. Context window efficiency is baked in — there’s an index.yml that holds descriptions for every standard, so the command can match what’s relevant without dumping your entire standards library into context.
/shape-spec — Run this in plan mode before implementing something significant. It walks you through structured shaping questions, creates a spec folder with a plan, scope decisions, references, and the relevant standards — all persisted as files. Months later, someone can find that folder and understand exactly what was built, why, and what constraints were in play.
/plan-product — Generates foundational product docs: mission, roadmap, tech stack. Feeds into /shape-spec so feature planning stays aligned with the bigger picture.
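To ground the commands above: the standards themselves are just Markdown files. A hypothetical file that /discover-standards might write, with contents invented here for illustration (the real tool generates whatever your codebase implies), could look like:

```markdown
# Standard: API Error Envelope

All error responses use our custom envelope, never the framework default.

- Shape: { "error": { "code": string, "message": string } }
- HTTP status codes still apply; the envelope carries the detail.
- Scope: everything under /services. The deprecated /legacy tree is exempt.
```

Nothing exotic: a heading, a rule, and the scope where it applies. The value is that it exists as a file an agent can be handed, not as a memory in someone's head.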
The Bit That Actually Impressed Me
Most developer tooling solves the obvious problem. Agent OS solves the harder, less obvious one: the index.
Here’s the issue: if you have 30 standards files and inject all of them before every task, you’re burning context window unnecessarily. Most of those standards aren’t relevant to what you’re doing right now.
The solution is index.yml — a file that holds short descriptions for every standard. When you run /inject-standards, it reads the index, matches descriptions against your current context, and only suggests the 2-5 files that are actually relevant. It never reads the full standards files until you confirm you need them.
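To make that concrete, a minimal index.yml might look like this. The entries and paths are illustrative, not the actual Agent OS schema:

```yaml
# index.yml: short descriptions only; the full standards stay on disk
api/response-format: Custom error envelope for all API responses
api/auth: Conventions for the auth module in /services/auth
db/migrations: How we write and review Postgres migrations
typescript/no-any: Why the any type is banned and what to use instead
frontend/components: React component folder and naming conventions
```

One line per standard is cheap enough to read on every invocation; the multi-page files behind those paths are not.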
It’s a retrieval pattern. Same instinct as building a RAG system: don’t dump everything into context, embed metadata for matching, fetch the actual content only when you’ve confirmed relevance.
If you've built RAG before
If you’ve worked with vector search and reranking, this will feel immediately familiar. It’s the same principle applied to your local Markdown files instead of a vector database.
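Stripped to its essence, the index-then-fetch step fits in a few dozen lines. Here is a sketch in Python, assuming a flat path-to-description index and using naive keyword overlap in place of whatever matching the real command does; the function names and scoring are mine, not Agent OS's:

```python
# Sketch of the index-then-fetch retrieval pattern.
# The index format and matching heuristic are illustrative assumptions,
# not Agent OS's actual implementation.

def load_index(index_text: str) -> dict[str, str]:
    """Parse a minimal 'path: description' index into a dict."""
    index = {}
    for line in index_text.splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            path, _, desc = line.partition(":")
            index[path.strip()] = desc.strip()
    return index

def suggest_standards(index: dict[str, str], task: str, limit: int = 5) -> list[str]:
    """Rank standards by keyword overlap between the task and each description.

    Only descriptions are consulted; the full standards files are never
    read until the user confirms a suggestion.
    """
    task_words = set(task.lower().split())
    scored = []
    for path, desc in index.items():
        overlap = len(task_words & set(desc.lower().split()))
        if overlap:
            scored.append((overlap, path))
    scored.sort(reverse=True)
    return [path for _, path in scored[:limit]]

index = load_index(
    "api/response-format: custom error envelope for all API responses\n"
    "db/migrations: how we write and review Postgres migrations\n"
    "frontend/components: React component folder structure"
)
print(suggest_standards(index, "add error envelope to the new API endpoint"))
```

A production version would use embeddings or at least stemming instead of raw word overlap, but the shape is the same: match against cheap metadata, fetch expensive content only on confirmation.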
Hype vs. Reality
This is the space where we try to separate the two, so let me be direct.
The hype: Agent OS is going to revolutionize how teams work with AI coding agents.
The reality: It’s a disciplined system for managing Markdown files and injecting them into prompts. There’s no magic. It works because the problem it solves is real — not because the solution is technically sophisticated.
If your team already has a culture of writing things down, Agent OS gives those documents teeth. They stop being reference material nobody reads and start being context that actually shapes AI behavior.
If your team doesn’t have that culture, Agent OS won’t create it. The /discover-standards command helps by extracting existing patterns from code — you don’t have to start from scratch — but someone still has to care about maintaining the standards files as the codebase evolves.
The sharp edge
This is currently built primarily for Claude Code, with deep slash command integration. If you’re using Cursor or Windsurf, you can reference the Markdown files directly, but you don’t get the same integrated workflow. That’s not a dealbreaker, but it’s a real gap.
What It Made Me Think About
When I built the AI copilot at VTEX, the hardest problem wasn’t the LLM. It was the data pipeline — how we ingested, chunked, and indexed documentation so the agent could retrieve relevant context at the right time.
Agent OS is solving the same problem for a different scope: instead of product documentation for a merchant copilot, it’s engineering conventions for your codebase. The architecture is simpler because the domain is smaller, but the instinct is identical.
Context quality > model quality. Always.
Every time you inject your team’s actual conventions into an agent session, you’re not making the model smarter — you’re giving it better inputs. The output difference is dramatic, and it has nothing to do with which model you’re using.
I’ve said this before and I’ll keep saying it: if you’re spending 80% of your energy choosing between GPT-4o and Claude 3.7 and 20% thinking about what context your agent actually has access to, flip those numbers.
Should You Use It?
Yes, with appropriate expectations.
If you’re a solo developer with Claude Code and you keep re-prompting the same patterns — your error handling conventions, your folder structure, your library preferences — Agent OS will save you real time. The setup cost is an hour, maybe two. The payback is immediate.
If you’re leading a team and tribal knowledge is your bottleneck for onboarding or consistent AI-assisted development, /discover-standards is genuinely useful. It forces a conversation about what your conventions actually are, which is valuable independent of any tooling.
If you’re hoping it will solve coordination problems or codebase entropy on its own — it won’t. It’s a memory layer, not a process layer.
Try It
Installation is a single script. The docs are clear. If you use Claude Code, you can have standards running in your first session today.
Building something with AI-powered dev workflows? I’m always interested in how other teams are solving the context problem. Find me on LinkedIn or GitHub.