Eating our own Dogfood

The meeting that changed how we think about agent oversight

Tim Jordan · March 16, 2026 · 5 min read

We were reviewing an agent’s outputs for one of our ventures, and something didn’t feel right. The agent had been running for a couple of weeks at that point. It had accumulated context, learned operational patterns, and was producing work that was good by every measurable standard: relevant recommendations, accurate information retrieval, clean communication.

But one particular output caught my attention, and it bothered me in a way I couldn’t immediately articulate. The agent had made a strategic recommendation that was technically sound but directionally wrong for the venture. Not “factually incorrect” wrong; “doesn’t understand the larger context of where this business is going” wrong.

That’s the moment the governance conversation got real.

The instinct that was wrong

My first instinct was guardrails: constraints, rules, the whole standard playbook. If the agent produced an output that missed the strategic direction, then we should define the strategic direction as a constraint and have the agent check its recommendations against it before delivering them.

I sat with that instinct for about a day before I realized it was the wrong response.

The problem with guardrails is that they’re reactive. You encounter a specific failure, you write a rule to prevent that specific failure, and you move on. Over time you accumulate a growing list of rules that constrain the agent’s behavior in increasingly specific ways. The agent becomes less capable, not more, because every guardrail is a reduction in its operational latitude.

I’ve seen this pattern in companies. A mistake happens, a policy is written, and nobody ever removes the policy. After ten years the employee handbook is 200 pages long, and nobody can move without bumping into a rule that was written for a situation that happened once in 2019.

The question we should have been asking

The right question wasn’t “how do we prevent this specific output?” but “why did the agent lack the context to know this was the wrong direction?”

When I reframed it that way, the answer was obvious, and it shifted everything. The agent had operational context: memory of conversations, task history, tool outputs. What it didn’t have was strategic context: the venture’s long-term direction, the founder’s priorities, the trade-offs we’d already decided on.

The agent made a reasonable recommendation based on the information it had. That’s the critical insight: the failure was in what information it had, not in how it reasoned about it.

What we changed

Instead of adding a guardrail, we added context, and it changed everything. We built a backstory injection module that runs during the Prepare stage of the cognitive pipeline. Before the agent starts reasoning about anything, it loads a persistent layer of strategic context specific to its organizational role and the venture it serves.

This isn’t a system prompt. System prompts are static instructions; the backstory is dynamic organizational knowledge: what this venture is about, what direction it’s heading, what constraints apply, what’s been decided, and what’s still open. It evolves as the venture evolves.
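A minimal sketch of the idea, with illustrative names (`Backstory`, `prepare`, and the field names are assumptions for this post, not the real module): the strategic layer is assembled and prepended before any task-specific reasoning begins.

```python
from dataclasses import dataclass, field


@dataclass
class Backstory:
    """Persistent strategic context for one venture/role pairing."""
    direction: str                                      # where the venture is heading
    priorities: list[str] = field(default_factory=list)  # the founder's priorities
    decided: list[str] = field(default_factory=list)     # trade-offs already settled
    open_questions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Flatten the backstory into a context block the agent reads first."""
        lines = [f"Direction: {self.direction}"]
        lines += [f"Priority: {p}" for p in self.priorities]
        lines += [f"Decided: {d}" for d in self.decided]
        lines += [f"Open: {q}" for q in self.open_questions]
        return "\n".join(lines)


def prepare(task: str, backstory: Backstory, operational_context: str) -> str:
    """Prepare stage: inject strategic context *before* reasoning starts."""
    return f"{backstory.render()}\n\n{operational_context}\n\nTask: {task}"
```

Because the backstory is data rather than a frozen prompt, updating it when the venture’s direction shifts is an edit to one record, not a rewrite of every agent’s instructions.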

The difference in output quality was immediate and striking. Not because the agent got smarter, but because it now had the context a human team member would have after being properly onboarded.

Guardrails vs. context

This became a design principle that shaped everything after it, and one of the most important ones we’ve locked. When an agent produces a bad output, the first question is always “did the agent have the right context?”, not “do we need a new rule?”

Guardrails make agents more constrained; context makes agents more capable. Both have their place, but the default should be context. Save guardrails for genuinely dangerous actions, not for quality issues that are actually information gaps.

I think about this in human terms, and it’s a useful parallel. If a new employee makes a bad recommendation, you don’t write a policy. You have a conversation, give them the background they were missing, and trust that with better information they’ll make better decisions. That’s how organizational learning actually works.

The agent version of “having a conversation” is updating the backstory, enriching the knowledge base, and making sure the memory system captures the right signals. It’s slower than writing a rule, but it produces better results.

The principle we locked

We documented this as a decision in our decision log, and it’s become one of the load-bearing principles in the whole system. The principle isn’t “no guardrails”; some actions need hard constraints, especially anything with irreversible consequences. The principle is “context before constraints.” Before adding a rule, verify that the agent had access to the information it would have needed to make the right call. If it didn’t, fix the information gap. If it did and still made the wrong call, then consider a constraint.
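The decision procedure is small enough to sketch as a triage function (the name and signature are illustrative, not from our codebase):

```python
def triage(agent_had_context: bool, action_is_irreversible: bool) -> str:
    """Context-before-constraints: decide how to respond to a bad agent output.

    Irreversible actions get hard constraints regardless. Everything else is
    treated as an information gap first; a rule is the last resort, reserved
    for the case where the agent had the context and still got it wrong.
    """
    if action_is_irreversible:
        return "add a hard constraint"
    if not agent_had_context:
        return "fix the information gap"
    return "consider a constraint"
```

Note the ordering: the irreversibility check comes first, so the “dangerous actions still get guardrails” carve-out can never be argued away by pointing at a context gap.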

This sounds simple, but in practice it requires discipline. The instinct after every failure is to add a rule, and resisting that instinct, asking instead “what was the agent missing?”, takes effort every time.

But the payoff is an agent that gets better with more context rather than one that gets more restricted with more rules. One of those paths leads to a genuinely useful organizational participant; the other leads to a very expensive compliance system.

We’re building the first kind.
