Most teams building AI agents ask the wrong question first. They open with “What can this agent do?” — and spend months optimizing capabilities, workflows, and interfaces. Then they get closer to the launch date and realize they never asked the question that actually matters: What should this agent absolutely never do, and who decides?

That question is the difference between an AI agent that gets shut down and one that’s still running two years from now.

My team recently designed an AI agent for one of the world’s largest professional services firms — a training platform used across dozens of countries, serving thousands of professionals who operate under strict legal and regulatory requirements.

The most important conversation we had on that project wasn’t about the AI agent’s capabilities. It was about risk. Before we designed a single workflow, we mapped everything the agent must never be allowed to do.

That conversation is the reason this agent will still be running in two years.

When you are building AI agents for a firm that trains thousands of professionals under legal and regulatory scrutiny, compliance isn’t something you bolt on at the end. It’s the foundation. At this level, the stakes of a hallucination, a data breach, or an off-brand response aren’t theoretical. They are reputational and regulatory.

Before any design work began, we mapped the risk perimeter by answering four questions:

  • What data can the agent touch?
  • What is completely off-limits?
  • When the agent doesn’t know the answer, does it say so — or does it guess?
  • Who owns the output legally if something goes wrong?

These aren’t engineering questions. They are governance questions. And the organizations that skip them build AI agents that get shut down within months.

McKinsey’s 2026 AI Trust Maturity Survey found that only about 30% of organizations have reached advanced maturity in strategy, governance, and agentic AI controls (McKinsey, 2026). In many organizations, AI deployment is advancing faster than the governance capabilities needed to oversee it.

Here’s what those governance decisions looked like in practice:

  1. Define what the agent is allowed to do. We restricted the agent to a curated, approved document set — signed off by both legal and learning leadership. No open internet access. No inference beyond what’s in the approved corpus. If the answer isn’t in the approved content, the agent doesn’t manufacture one.
  2. Establish where human judgment is non-negotiable. Any question touching compliance, legal interpretation, or regulatory guidance triggers a hard stop. The agent flags it, declines to answer, and routes it to a human reviewer. No exceptions. No guessing on the hard stuff.
  3. Decide who sees everything. Every query, every response, every escalation is logged with a timestamp, user, and confidence score. A complete accountability trail — not just for internal oversight, but for the regulatory environment the firm operates in.
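
To make those three decisions concrete, here is a minimal Python sketch of how they might fit together. It is an illustration, not the system we shipped. The names (GovernedAgent, RESTRICTED_TERMS, MIN_CONFIDENCE), the keyword-based hard stop, and the word-overlap "confidence" score are all assumptions made for the example. A production agent would likely use a reviewed policy classifier and real retrieval in their place.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical trigger words; a real system would use a reviewed policy, not a keyword set.
RESTRICTED_TERMS = {"compliance", "legal", "regulatory", "regulation"}
MIN_CONFIDENCE = 0.5  # assumed threshold below which the agent declines to answer

ESCALATED = "This question needs a human reviewer. It has been routed for review."
DECLINED = "I don't have an approved answer for that."


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


@dataclass
class AuditRecord:
    timestamp: str
    user: str
    query: str
    response: str
    confidence: float
    escalated: bool


@dataclass
class GovernedAgent:
    approved_corpus: dict[str, str]  # document id -> approved text
    audit_log: list[AuditRecord] = field(default_factory=list)

    def answer(self, user: str, query: str) -> str:
        words = tokenize(query)
        # Decision 2: restricted topics are a hard stop and go to a human.
        if words & RESTRICTED_TERMS:
            return self._record(user, query, ESCALATED, 0.0, escalated=True)
        # Decision 1: answer only from the approved corpus, never guess.
        text, confidence = self._best_match(words)
        if confidence < MIN_CONFIDENCE:
            return self._record(user, query, DECLINED, confidence, escalated=False)
        return self._record(user, query, text, confidence, escalated=False)

    def _best_match(self, words: set[str]) -> tuple[str, float]:
        # Toy confidence: fraction of query words found in an approved document.
        best_text, best_score = "", 0.0
        for text in self.approved_corpus.values():
            score = len(words & tokenize(text)) / len(words) if words else 0.0
            if score > best_score:
                best_text, best_score = text, score
        return best_text, best_score

    def _record(self, user: str, query: str, response: str,
                confidence: float, escalated: bool) -> str:
        # Decision 3: every query, response, and escalation is logged.
        self.audit_log.append(AuditRecord(
            timestamp=datetime.now(timezone.utc).isoformat(),
            user=user, query=query, response=response,
            confidence=confidence, escalated=escalated))
        return response
```

Note the order of the checks. The hard stop runs before any retrieval, so a restricted question never reaches the step that could produce an answer, and every path returns through the same logging call, so nothing leaves the agent unrecorded.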

This isn’t the fastest agent I’ve ever built. But it is the one most likely to still be running in two years, because it was designed with longevity in mind from day one.

Enterprise AI doesn’t fail because the technology breaks. It fails because the governance breaks. The model performs, but the guardrails were never built. Someone asks the wrong question, the agent guesses, and suddenly there’s a compliance issue, a PR problem, or both.

If you are evaluating AI agents for your organization, start with the governance layer — not the feature list. Ask your team: What should this agent absolutely never do? Who decides? Who reviews it? Who owns the outcome?

Get those answers on paper before you write a line of code. That’s the architecture decision that actually matters.