AI Approval Gates Are Not Bureaucracy

The future of AI operations is not agents doing everything alone. It is agents doing the boring investigation work, then asking a human before the expensive part.

[Illustration: a governed AI teammate stops at an approval gate while a human reviews evidence before a risky production action.]

Every AI demo eventually reaches the same magical sentence:

“And then the agent just does it for you.”

That is usually when the room gets excited. It is also when someone in the room should become deeply suspicious.

Does what, exactly? Restarts production? Changes a firewall rule? Refunds a customer? Deletes stale data? Ships a hotfix? Rotates credentials? Sends a message to every user explaining an incident it only half understands?

“The agent just does it” is a lovely demo sentence and a terrible operational policy.

The real future of AI operations is not fully autonomous systems sprinting through your infrastructure with a permission slip written by a prompt. The future is agents doing the slow, boring, evidence-gathering work, then stopping at the moment where judgment matters.

That stop is not bureaucracy.

That stop is the product.

Automation makes mistakes faster

The history of technology has a very consistent lesson: when automation is right, it is magic. When automation is wrong, it is wrong at machine speed.

This pattern did not start with large language models. AI is just making it easier for more teams to experience the old problem in new places.

In 2012, Knight Capital suffered one of the most famous automation failures in finance. According to the SEC’s enforcement release, a code deployment problem in an automated equity router led Knight to send more than 4 million orders into the market in the first 45 minutes of trading while trying to fill just 212 customer orders. The firm traded more than 397 million shares and lost more than $460 million.

That was not a chatbot hallucinating. That was automation without enough effective brakes.

The SEC called out missing controls, weak deployment procedures, insufficient incident response procedures, and risk controls that could not stop the damage at the right boundary. In other words: the system could act much faster than the organization could notice, understand, and approve a correction.

That is the nightmare pattern.

Not “software made a mistake.” Software always makes mistakes.

The real issue is “software was allowed to turn a mistake into a disaster before a responsible human could stop it.”

The old lesson keeps repeating

The 2010 Flash Crash is another useful scar. The joint CFTC and SEC report found that a large seller used an automated execution algorithm to sell 75,000 E-Mini contracts, worth about $4.1 billion, during already stressed market conditions. The algorithm targeted a share of trading volume without regard to price or time, and it executed the whole sell program in about 20 minutes.

The sentence in the CFTC and SEC report that should stick with every AI product builder is this: when executing a large trade, a customer has to choose how much human judgment is involved.

That is not just a finance sentence.

That is an AI operations sentence.

When an agent can change production, publish an answer, modify permissions, or trigger a customer-facing workflow, you are choosing how much human judgment is involved. You can pretend you are not choosing. You can hide the choice inside a settings page called “autonomous mode.” But you are still choosing.

And the more expensive the action, the more that choice matters.

Public answers are actions too

Teams often understand approval gates for infrastructure changes. They are less careful with text.

That is a mistake.

In 2016, Microsoft launched Tay, a public chatbot on Twitter. Microsoft said that within the first 24 hours a coordinated attack exploited a vulnerability in Tay, which then produced inappropriate and harmful output. Microsoft took responsibility and took Tay offline.

The lesson is not “never launch chatbots.” That would be lazy. The lesson is that public output is an operational action. A model answer can damage trust, create liability, confuse users, or put support teams in a hole they now have to dig out of with very human shovels.

Air Canada learned a quieter version of the same lesson. In Moffatt v. Air Canada, a customer relied on a website chatbot that gave incorrect bereavement fare guidance. A British Columbia tribunal found Air Canada liable, and a legal summary of the decision notes that the airline remained responsible for information on its website whether it came from a static page or a chatbot.

Again: the answer was not “just text.” It was customer guidance.

If an AI system is speaking for the company, approving that output matters. Sometimes approval can be a policy layer. Sometimes it can be retrieval from verified content. Sometimes it must be a human. But pretending the output is harmless because no database row changed is how teams end up learning law through invoices.

Read-only is different from write-capable

This is the distinction every serious AI product needs to make.

An agent reading logs is not the same as an agent restarting a service.

An agent summarizing a runbook is not the same as an agent editing the runbook.

An agent drafting a customer message is not the same as an agent sending it.

An agent finding suspicious IAM permissions is not the same as an agent revoking them.

These are different products. They deserve different permissions, different UX, different audit trails, and different failure budgets.

For Vedin, this is the line we care about. Let the AI teammate investigate aggressively. Let it read alerts, query metrics, search logs, inspect deploy history, compare the current incident with past incidents, and assemble a recommendation while the on-call engineer is still opening the laptop.

That is the work machines should do.

But when the recommendation becomes an action, the agent should stop and ask:

  • What do I want to do?
  • Why do I think this is the right action?
  • What evidence supports it?
  • What is the blast radius?
  • What is the rollback?
  • Who is approving it?
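One way to make those six questions hard to skip is to turn them into a record the agent must fill in before it can ask. The sketch below is illustrative only: the `ApprovalRequest` class and its field names are hypothetical, not a description of how Vedin or any other product stores approvals.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ApprovalRequest:
    """Hypothetical record an agent completes before asking to act."""

    action: str                     # What do I want to do?
    rationale: str                  # Why do I think this is the right action?
    evidence: List[str]             # What evidence supports it? (links, log lines)
    blast_radius: str               # What is the blast radius?
    rollback: str                   # What is the rollback?
    approver: Optional[str] = None  # Who is approving it? Empty until a human signs off.

    def missing_answers(self) -> List[str]:
        """Return the questions the agent left blank (the approver is filled in later)."""
        return [
            f.name
            for f in fields(self)
            if f.name != "approver" and not getattr(self, f.name)
        ]

    def ready_for_review(self) -> bool:
        """A request reaches a human only when every question has an answer."""
        return not self.missing_answers()
```

A request with an empty `rollback` never reaches the reviewer, which is the point: the human sees complete proposals, not half-formed ones.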

That is not slowing the system down. That is making the system trustworthy enough to use.

Good approvals are not rubber stamps

A bad approval flow is just a button that says “confirm” after the system has already emotionally shoved the user toward yes.

That is theater.

A good approval flow gives the human enough context to make a real decision quickly.

It shows the proposed action in plain language. It explains the evidence. It marks confidence and uncertainty separately. It names the systems affected. It highlights irreversible parts. It links to the logs, runbooks, metrics, and diffs behind the recommendation. It records who approved what and when. It expires stale approvals because a safe action at 10:03 can be a dangerous action at 10:17.
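The recording and expiry parts of that list are simple enough to sketch. The example below assumes a ten-minute lifetime and invented names (`Approval`, `run_with_approval`); a real system would choose its own window per action and persist the record somewhere durable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Approval:
    """Hypothetical audit record: who approved what, and when."""

    action: str
    approved_by: str
    approved_at: datetime
    ttl: timedelta = timedelta(minutes=10)  # assumed lifetime, not a recommendation

    def is_fresh(self, now: datetime) -> bool:
        """An approval counts only inside its time window."""
        return self.approved_at <= now < self.approved_at + self.ttl


def run_with_approval(
    approval: Approval, action: str, now: datetime, run: Callable[[], T]
) -> T:
    """Run the action only if the approval matches it and has not expired."""
    if approval.action != action:
        raise PermissionError("approval was given for a different action")
    if not approval.is_fresh(now):
        raise PermissionError("approval is stale; ask again")
    return run()
```

With these assumptions, an approval granted at 10:03 still allows the action at 10:05 and raises `PermissionError` at 10:17, so the agent has to go back and ask again with current evidence.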

The human should not have to reverse-engineer the agent’s thinking from a paragraph of confident fog.

Approval should feel like reviewing a competent teammate’s incident note:

“Payment API latency rose after deploy 2.4.1. Error rate is isolated to the new checkout worker. Queue depth is normal. Database latency is normal. Rollback is low risk because no migration ran. Recommend rollback of checkout-worker to 2.4.0.”

That is useful.

“I found an issue and can fix it” is not useful. That is a button without enough evidence behind it.

Not everything needs a meeting

Approval gates do not mean every agent action requires a committee, a calendar invite, and a director nodding gravely over a spreadsheet.

Some work should be automatic:

  • Reading logs
  • Searching documentation
  • Summarizing incidents
  • Drafting commands
  • Generating an investigation timeline
  • Opening a ticket
  • Preparing a rollback plan

Some work should require approval:

  • Restarting production services
  • Rolling back or deploying code
  • Scaling expensive infrastructure
  • Changing permissions
  • Modifying customer data
  • Sending customer-facing messages
  • Running destructive database operations
  • Making financial, medical, legal, or policy-sensitive commitments

This is not complicated. It is just rarely written down with enough seriousness.

The trick is to build approval policy into the product instead of treating it as a Slack convention. “Ask the on-call lead before touching prod” is not an access control model. It is tribal knowledge wearing a hoodie.
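Writing it down can be as small as a lookup table that the product enforces. The sketch below encodes a few entries from the two lists above; the action names and the `gate_for` helper are made up for illustration, and the one real design choice is that anything not listed falls through to approval.

```python
from enum import Enum


class Gate(Enum):
    AUTOMATIC = "automatic"
    REQUIRES_APPROVAL = "requires_approval"


# Hypothetical action names; a real product would define its own catalog.
POLICY = {
    "read_logs": Gate.AUTOMATIC,
    "search_documentation": Gate.AUTOMATIC,
    "summarize_incident": Gate.AUTOMATIC,
    "draft_command": Gate.AUTOMATIC,
    "open_ticket": Gate.AUTOMATIC,
    "prepare_rollback_plan": Gate.AUTOMATIC,
    "restart_production_service": Gate.REQUIRES_APPROVAL,
    "rollback_or_deploy_code": Gate.REQUIRES_APPROVAL,
    "change_permissions": Gate.REQUIRES_APPROVAL,
    "modify_customer_data": Gate.REQUIRES_APPROVAL,
    "send_customer_message": Gate.REQUIRES_APPROVAL,
}


def gate_for(action: str) -> Gate:
    """Look up the gate for an action; unknown actions require approval."""
    return POLICY.get(action, Gate.REQUIRES_APPROVAL)
```

The default matters more than any single entry. If a new capability ships before anyone classifies it, the agent asks first instead of acting first.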

The best AI teammate knows when to stop

We should want AI systems that do more. We should want agents that can gather context, correlate signals, keep notes, remember past incidents, and make on-call less punishing.

But more capable agents need clearer boundaries, not fuzzier ones.

The question is not “can the agent do this?”

The better question is “should the agent be allowed to do this without approval?”

For many operational tasks, the right answer is no. And that no is not fear. It is engineering.

Because the point of AI in operations is not to remove humans from responsibility. It is to remove humans from the repetitive investigation work so they can spend their judgment where it actually matters.

Let the agent read everything. Let it draft the fix. Let it explain the blast radius. Let it prepare the rollback.

Then make it stop.

Make it ask.

Make the approval explicit, recorded, and reversible where possible.

That is how you get speed without pretending consequences disappeared. That is how you get autonomy without turning production into a trust exercise. That is how an AI operations teammate earns the right to sit near real systems.

The proud version of this future is not an agent that never asks permission.

It is a teammate that learns your systems and knows exactly when permission is the point.