When should an AI agent require human-in-the-loop approval?

Question

Accepted Answer

An AI agent should require human-in-the-loop approval for high-stakes or irreversible actions, ambiguous edge cases, situations where the agent's confidence is low, and cases where regulation mandates human review. Human oversight of this kind is a critical aspect of the NIST AI RMF Govern function and of the ISO/IEC 42001 human oversight requirements.

Concrete controls for human-in-the-loop approval include:

Pre-action approval gates should be implemented for high-stakes actions such as financial transactions above a threshold, communications to external parties, irreversible operations, or actions affecting many users. The approval interface should present the agent's proposed action, its reasoning, the data it considered, and the policy that triggered the approval requirement. To prevent approval fatigue (OWASP LLM Top 10 L5, L6), risk-based routing can batch low-risk approvals for asynchronous review while surfacing high-risk ones in real time.
Escalation policies should route specific situations to qualified humans: for example, a medical diagnosis agent escalating differential diagnoses to a physician, or a financial agent escalating positions above a threshold to a licensed advisor. This addresses the risk of approval being delegated to under-qualified reviewers (OWASP LLM Top 10 L6).
Approval windows should default to deny on timeout. This mitigates time-based attacks (OWASP LLM Top 10 L3, L6) in which an attacker stalls human review to force an auto-approval.
Multi-party approval should be required for catastrophic-risk actions, ensuring that no single human or AI principal can authorize the most consequential operations alone. This also helps mitigate override misuse (OWASP LLM Top 10 L6).
Budget-based autonomy can serve as a backstop, halting an agent and requiring human review if it exceeds a configured budget of tokens, dollars, calls, or affected records, regardless of the action's risk level.
Structured logging of all human decisions, including approvals and overrides, should be integrated into the same audit stream as agent actions. This ensures traceability and accountability and closes audit blind spots around human decisions (OWASP LLM Top 10 L5, L6).
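
The controls above can be combined in a single gate. The following is a minimal Python sketch, not a reference implementation. Every name in it (`ApprovalGate`, `ProposedAction`, `ask_human`, the three risk tiers, the dollar budget) is a hypothetical chosen for illustration. It assumes that a reviewer callback returning `None` represents an expired approval window, and that low-risk actions proceed immediately and are logged for later asynchronous review.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional
import json
import time


class Decision(Enum):
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class ProposedAction:
    name: str
    risk: str            # hypothetical tiers: "low", "high", "catastrophic"
    reasoning: str       # shown to the reviewer alongside the action
    policy_reason: str = ""
    cost_usd: float = 0.0


# Reviewer callback: True = approve, False = reject, None = window timed out.
AskHuman = Callable[[ProposedAction, str], Optional[bool]]


@dataclass
class ApprovalGate:
    budget_usd: float = 100.0  # assumed budget unit; could be tokens or calls
    required_approvers: dict = field(
        default_factory=lambda: {"low": 0, "high": 1, "catastrophic": 2}
    )
    spent_usd: float = 0.0
    audit_log: List[str] = field(default_factory=list)

    def _log(self, event: str, action: ProposedAction, **extra) -> None:
        # Human decisions and agent actions go into the same audit stream.
        record = {"ts": time.time(), "event": event, "action": action.name}
        record.update(extra)
        self.audit_log.append(json.dumps(record))

    def review(self, action: ProposedAction, reviewers: List[str],
               ask_human: AskHuman) -> Decision:
        needed = self.required_approvers[action.risk]

        # Budget backstop: over budget forces human review at any risk level.
        if self.spent_usd + action.cost_usd > self.budget_usd:
            needed = max(needed, 1)
            self._log("budget_exceeded", action)

        # Multi-party approval requires distinct reviewers.
        distinct = list(dict.fromkeys(reviewers))
        if len(distinct) < needed:
            self._log("denied", action, reason="too few distinct reviewers")
            return Decision.DENIED

        for reviewer in distinct[:needed]:
            answer = ask_human(action, reviewer)
            self._log("human_decision", action, reviewer=reviewer, answer=answer)
            # Timeout (None) and explicit rejection (False) both deny.
            if answer is not True:
                self._log("denied", action, reason="rejected or timed out")
                return Decision.DENIED

        self.spent_usd += action.cost_usd
        self._log("approved", action, approvals=needed)
        return Decision.APPROVED
```

In this sketch, `gate.review(action, ["alice", "bob"], ask_human)` returns `Decision.APPROVED` only when the required number of distinct reviewers each answer `True`. A real deployment would also need reviewer authentication, qualification checks for escalation, and durable log storage, none of which are shown here.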
