"An Agent Opened a PR — Who Merges It?"
This is the question we hear most often in AI agent adoption workshops. "If an agent writes code, who reviews the PR? If there's an incident, who's accountable?"
Teams that adopt agents without answering this abandon adoption within 3-6 months, or spend more time cleaning up agent mistakes than the agents saved.
You must design the trust hierarchy before deployment. Below are 5 principles.
Principle 1 — Privilege Separation
Rule: Separate system access by agent role. Apply the principle of least privilege.
Why: A single omnipotent agent has unlimited blast radius on failure. Splitting privileges by role contains damage to one domain.
In practice:
- Backend agent: DB read + code write, no production deploy
- Frontend agent: UI code + design token read, no backend code
- Data agent: read replica only, no original DB writes
At the tool layer, grant via per-MCP-server permissions.
Principle 2 — Audit Logs
Rule: Record every agent action. Who, when, and why must be traceable.
Why: If you can't answer "Why is this code structured this way?", trust in agent output drops to zero. Auditability is a precondition of trust.
In practice:
- Record every MCP tool call (which tool, which arguments, which response)
- Preserve the rationale prompts behind each decision (task decomposition, prioritization)
- Log human intervention points and reasons
Marblo's kanban board persists per-card activity logs.
Principle 3 — Rollback Paths
Rule: Every agent decision must have a reversible path.
Why: Assuming AI is 100% accurate is dangerous. Humans make mistakes too. The difference is how fast you can roll back.
In practice:
- Code changes → Git-revertable commits
- DB changes → migrations always have down functions
- External API calls (email, payments) → dry-run mode first
In particular, irreversible actions like payments and customer notifications must pass through a human approval gate.
Principle 4 — Measurable KPIs
Rule: Define core KPIs for the agent adoption and measure them on a cadence.
Why: "It feels like AI is helping, is it though?" is a measurement question, not a feeling. Without KPIs, you can't justify adoption or guide improvement.
Example KPIs:
| Category | Metric |
|---|---|
| Cost | Monthly token cost / split by model |
| Accuracy | Auto-merge rate (PRs merged without human edits) |
| Time | Average task processing time |
| Stability | Rollback rate / incident frequency |
| Satisfaction | Internal user NPS |
Review these monthly in a governance meeting.
Principle 5 — Team Training
Rule: Agents are both tool and colleague. Train your team to collaborate with them.
Why: Even the best governance policy is meaningless if people don't use it. Teams must learn the new way of working with agents.
In practice:
- Agentic teamwork workshops — judgment training on what to delegate to agents vs. do yourself
- Prompt review culture — regular sessions reviewing peers' prompts
- Incident retrospectives — analyze agent mistakes together
All 5 in One Place — Marblo + Consulting
Implementing all 5 principles from scratch is non-trivial. Marblo provides these out of the box:
- Per-agent-role privilege separation (Principle 1)
- Automatic audit logs of every action (Principle 2)
- Card-level rollback (Principle 3)
- Cost / time / stability dashboards (Principle 4)
Principle 5 (team training) requires more than tooling. In-house Adoption Consulting co-designs all 5 principles for your environment and trains your team alongside.
Closing
Governance is not an add-on for later. If you don't design it from day one, an incident hits within 6 months, and rebuilding governance reactively is far more expensive.
The most important job for PMs and CTOs in the agent era isn't "what should we automate" — it's "how do we design the trust hierarchy for that automation".