We implemented human approval for risky agent operations. Delete operations, external API calls, anything touching production databases—all required manager approval. Seemed prudent. The agent would request approval, a human would review, and execution would proceed only after explicit authorization.
Within a week, agents were useless. Approval requests piled up faster than humans could review them. Managers spent hours reviewing routine operations. Agents sat idle waiting for approvals that should have been automatic. The system became a bottleneck. Users stopped using agents because the latency was intolerable.
We'd implemented human-in-the-loop correctly from a security perspective but destroyed it from a usability perspective. Every risky operation required human approval, which sounds right until you realize that "risky" is context-dependent and most operations the agent proposes are actually fine. Requiring approval for everything breaks agent autonomy. Not requiring approval for anything is reckless.
The fundamental problem is that blanket approval requirements don't distinguish between operations that genuinely need human oversight and operations that are technically risky but contextually safe. Deleting a test database table at the agent's discretion is fine. Deleting a production table containing customer data requires human review. Both are "delete operations" but have completely different risk profiles.
What we need is selective human-in-the-loop: a system that identifies which operations genuinely require human approval based on actual risk, routes those operations to appropriate reviewers, makes approval fast when needed, and maintains agent autonomy for safe operations. Not "approve everything risky" but "approve operations that are risky in this specific context."
The Autonomy-Safety Tradeoff
Human-in-the-loop systems balance two competing goals: safety (prevent harmful operations) and autonomy (let agents work without constant intervention). Most implementations optimize for one at the expense of the other.
Over-indexed on safety:
- Every operation above some risk threshold requires approval
- Agents can't complete simple tasks without human intervention
- Approval queues become bottlenecks
- Users abandon agents as too slow
- Human reviewers suffer approval fatigue and rubber-stamp without careful review
Over-indexed on autonomy:
- Agents operate independently with minimal oversight
- Risky operations execute without review
- Incidents happen that could have been prevented
- Post-incident investigations reveal obvious problems that humans would have caught
- Trust in agent systems erodes
The false dichotomy: These aren't the only options. The problem is treating approval as binary (required or not required) rather than conditional (required when specific risk factors are present).
The correct mental model: Risk-based approval routing
Traditional model: Operation type determines approval requirement
Risk-based model: Operation risk in current context determines approval requirement
The shift is from static rules ("all deletes require approval") to dynamic assessment ("this delete in this context with this agent's history requires approval, that delete doesn't").
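The contrast can be sketched in a few lines. This is a deliberately minimal illustration, not the production assessor described later; the `is_production` check and the anomaly threshold are hypothetical stand-ins for the richer context signals discussed below.

```python
# Static model: operation type alone decides (hypothetical sketch).
def needs_approval_static(op_type: str) -> bool:
    return op_type == "delete"  # every delete pauses for a human


# Risk-based model: the same operation type is judged in context.
def needs_approval_contextual(op_type: str, resource: str,
                              recent_agent_anomalies: int) -> bool:
    is_production = "prod" in resource.lower()  # crude environment signal
    return op_type == "delete" and (is_production or recent_agent_anomalies > 2)


# A delete on a throwaway test table runs autonomously;
# the identical delete against production pauses for review.
assert needs_approval_static("delete") is True
assert needs_approval_contextual("delete", "test_db.scratch", 0) is False
assert needs_approval_contextual("delete", "prod_db.customers", 0) is True
```

The same operation type yields different answers depending on environment and agent history, which is the entire shift the risk-based model makes.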
Key insights for effective multi-party authorization:
Insight 1: Most agent operations don't need human approval. The agent is making reasonable decisions based on valid context. Requiring approval for routine operations destroys utility.
Insight 2: Some operations always need approval. Truly destructive operations (drop production database, delete customer data, change security policies) should never execute automatically regardless of context.
Insight 3: Most operations fall in between. A database update might be safe if it's a test environment, risky if it's production. An external API call might be safe to a trusted partner, risky to an unknown endpoint. Context determines risk.
Insight 4: Approval speed matters. If approval takes hours, agents become unusable. Approval workflows must be fast—minutes, not hours—or agents can't maintain conversational flow.
Insight 5: Reviewer fatigue is real. When humans approve hundreds of routine requests, they stop reviewing carefully. High-volume approval requirements create security theater without actual safety.
The invariant to maintain:
Operations execute autonomously when contextually safe. Operations pause for human approval when context indicates genuine risk. The approval requirement adapts to actual risk, not operation type alone.
Risk-Based Approval Architecture
A multi-party authorization system evaluates operation risk dynamically, routes high-risk operations to appropriate approvers, enables fast approval workflows, and maintains agent autonomy for safe operations.
Component responsibilities:
Risk Assessor: Evaluates operation risk dynamically based on multiple factors. Not static rules—contextual assessment.
Context Analyzer: Examines current operational context—environment (prod/staging), data sensitivity, user permissions, time of day.
Operation Classifier: Categorizes operation type—read, write, delete, admin, external call. Different types have different base risk levels.
Historical Behavior Check: Reviews agent's past behavior. Agents with consistent good behavior get more autonomy. Agents with recent anomalies get more oversight.
Environment Detector: Identifies target environment. Same operation has different risk in production vs staging.
Risk Score Calculator: Combines all factors into risk score (0.0-1.0) determining approval requirement.
Risk Levels with Different Handling:
- Low Risk (0.0-0.3): Auto-approve. Agent autonomy preserved.
- Medium Risk (0.3-0.6): Single approver. Fast human oversight.
- High Risk (0.6-0.8): Multi-party approval. Multiple reviewers must agree.
- Critical Risk (0.8-1.0): Automatic block. Never execute even with approval.
Approval Router: Selects appropriate approver based on operation type, reviewer availability, and domain expertise.
Approval UI: Fast, mobile-friendly interface for reviewers. Shows operation details, context, agent reasoning, risk factors.
SLA Monitor: Tracks approval latency. Escalates stuck approvals. Ensures agent isn't blocked indefinitely.
Key architectural properties:
Dynamic risk assessment: Risk evaluated per operation based on current context, not static per operation type.
Graduated approval requirements: Not binary approve/don't-approve. Four levels with different handling.
Fast approval path: Medium-risk operations route to single approver with push notifications. Target: approve within 2 minutes.
Automatic blocking for extreme risk: Some operations never execute regardless of approvals. Safety override.
Reviewer load balancing: Distributes approval requests across available reviewers. Prevents bottlenecks.
SLA enforcement: Operations don't wait indefinitely. Timeout policies (auto-deny after 15 minutes, escalate, etc.).
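A timeout policy like the one described above might be encoded as a small value object. This is a hedged sketch: the class name matches the `ApprovalTimeoutPolicy` referenced in the implementation below, but the thresholds and status strings here are illustrative assumptions, and the fail-closed choice (deny, never auto-approve, on timeout) is one reasonable default.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Dict


@dataclass
class ApprovalTimeoutPolicy:
    """Sketch: what to do when an approval request exceeds its SLA."""
    escalate_after_minutes: int = 5   # nudge escalation reviewers first
    deny_after_minutes: int = 15      # fail closed if still unanswered

    def handle_timeout(self, approval: Dict[str, Any], now: datetime) -> str:
        age_minutes = (now - approval['created_at']).total_seconds() / 60
        if age_minutes >= self.deny_after_minutes:
            return 'expired_denied'   # never auto-approve on timeout
        if age_minutes >= self.escalate_after_minutes:
            return 'escalated'
        return 'pending'
```

The important design choice is the final branch order: a stuck approval escalates before it expires, and expiry always resolves to denial rather than silent execution.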
Implementation: Building Selective Approval
Here's what risk-based multi-party authorization looks like in production.
Risk Assessment Engine
```python
from dataclasses import dataclass
from typing import Dict, Any, List
from datetime import datetime, timedelta
from enum import Enum

# NOTE: BehaviorTracker and EnvironmentDetector are assumed to be
# provided by the surrounding system; they are not defined here.


class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class OperationRiskAssessment:
    risk_score: float  # 0.0-1.0
    risk_level: RiskLevel
    contributing_factors: Dict[str, float]
    reasoning: str
    requires_approval: bool
    required_approvers: int  # 0, 1, or 2+


class RiskAssessor:
    """
    Evaluates operation risk dynamically.
    """

    def __init__(self):
        self.behavior_tracker = BehaviorTracker()
        self.environment_detector = EnvironmentDetector()

    def assess_risk(
        self,
        operation_type: str,
        target_resource: str,
        operation_parameters: Dict[str, Any],
        agent_id: str,
        user_context: Dict[str, Any]
    ) -> OperationRiskAssessment:
        """
        Assess operation risk based on multiple factors.
        """
        factors = {}

        # Factor 1: Base operation risk
        base_risk = self._get_base_operation_risk(operation_type)
        factors['base_operation'] = base_risk

        # Factor 2: Environment risk
        environment = self._detect_environment(target_resource)
        env_risk = 0.0 if environment == "test" else 0.3 if environment == "staging" else 0.6
        factors['environment'] = env_risk

        # Factor 3: Data sensitivity
        sensitivity_risk = self._assess_data_sensitivity(target_resource, operation_parameters)
        factors['data_sensitivity'] = sensitivity_risk

        # Factor 4: Agent behavior history
        behavior_risk = self._assess_agent_behavior(agent_id)
        factors['agent_behavior'] = behavior_risk

        # Factor 5: Operation scope
        scope_risk = self._assess_operation_scope(operation_parameters)
        factors['scope'] = scope_risk

        # Factor 6: Time-based risk
        time_risk = self._assess_timing()
        factors['timing'] = time_risk

        # Calculate weighted risk score
        risk_score = (
            base_risk * 0.25 +
            env_risk * 0.25 +
            sensitivity_risk * 0.20 +
            behavior_risk * 0.15 +
            scope_risk * 0.10 +
            time_risk * 0.05
        )

        # Determine risk level and approval requirements
        if risk_score >= 0.8:
            risk_level = RiskLevel.CRITICAL
            requires_approval = False  # Block entirely
            required_approvers = 0
            reasoning = "Critical risk - automatic block"
        elif risk_score >= 0.6:
            risk_level = RiskLevel.HIGH
            requires_approval = True
            required_approvers = 2
            reasoning = "High risk - requires multiple approvers"
        elif risk_score >= 0.3:
            risk_level = RiskLevel.MEDIUM
            requires_approval = True
            required_approvers = 1
            reasoning = "Medium risk - requires single approval"
        else:
            risk_level = RiskLevel.LOW
            requires_approval = False
            required_approvers = 0
            reasoning = "Low risk - auto-approved"

        return OperationRiskAssessment(
            risk_score=risk_score,
            risk_level=risk_level,
            contributing_factors=factors,
            reasoning=reasoning,
            requires_approval=requires_approval,
            required_approvers=required_approvers
        )

    def _get_base_operation_risk(self, operation_type: str) -> float:
        """Base risk by operation type."""
        risk_map = {
            'read': 0.1,
            'write': 0.3,
            'update': 0.4,
            'delete': 0.7,
            'admin': 0.8,
            'external_call': 0.5
        }
        return risk_map.get(operation_type.lower(), 0.5)

    def _detect_environment(self, resource: str) -> str:
        """Detect if resource is prod, staging, or test."""
        if 'prod' in resource.lower() or 'production' in resource.lower():
            return 'production'
        elif 'staging' in resource.lower() or 'stage' in resource.lower():
            return 'staging'
        else:
            return 'test'

    def _assess_data_sensitivity(
        self,
        resource: str,
        parameters: Dict[str, Any]
    ) -> float:
        """Assess sensitivity of data being accessed."""
        sensitive_patterns = [
            'customer', 'user', 'payment', 'credential',
            'password', 'token', 'api_key', 'ssn'
        ]

        resource_lower = resource.lower()
        if any(pattern in resource_lower for pattern in sensitive_patterns):
            return 0.7

        # Check parameters for bulk operations
        if 'limit' in parameters:
            limit = parameters.get('limit', 0)
            if limit > 1000:
                return 0.5  # Bulk operation

        return 0.2

    def _assess_agent_behavior(self, agent_id: str) -> float:
        """Assess agent's recent behavior pattern."""
        recent_anomalies = self.behavior_tracker.get_recent_anomalies(agent_id)

        if recent_anomalies > 5:
            return 0.8  # High risk from anomalous agent
        elif recent_anomalies > 2:
            return 0.5  # Moderate concern
        else:
            return 0.1  # Good behavior

    def _assess_operation_scope(self, parameters: Dict[str, Any]) -> float:
        """Assess scope/impact of operation."""
        # Check for wildcard patterns
        if parameters.get('filter') == '*' or parameters.get('where_clause') == '1=1':
            return 0.8  # Affects all records

        # Check row limits
        affected_rows = parameters.get('affected_rows_estimate', 1)
        if affected_rows > 10000:
            return 0.7
        elif affected_rows > 1000:
            return 0.4

        return 0.1

    def _assess_timing(self) -> float:
        """Assess risk based on time of operation."""
        hour = datetime.now().hour

        # Off-hours operations slightly more risky
        if 0 <= hour < 6 or 22 <= hour < 24:
            return 0.3

        return 0.0
```

Approval Workflow

```python
import uuid

# NOTE: ReviewerPool, NotificationService, ApprovalTimeoutPolicy, and the
# module-level approval_db, reviewer_pool, and notification_service handles
# are assumed to be provided by the surrounding system.


class ApprovalRouter:
    """
    Routes approval requests to appropriate reviewers.
    """

    def __init__(self):
        self.reviewer_pool = ReviewerPool()
        self.notification_service = NotificationService()

    def request_approval(
        self,
        operation: Dict[str, Any],
        risk_assessment: OperationRiskAssessment,
        required_approvers: int
    ) -> str:
        """
        Route approval request to reviewers.
        """
        # Select reviewers
        reviewers = self._select_reviewers(
            operation_type=operation['type'],
            required_count=required_approvers
        )

        # Create approval request
        approval_id = self._create_approval_request(
            operation=operation,
            risk_assessment=risk_assessment,
            reviewers=reviewers
        )

        # Notify reviewers
        for reviewer in reviewers:
            self.notification_service.send_approval_request(
                reviewer_id=reviewer['id'],
                approval_id=approval_id,
                operation_summary=operation['summary'],
                risk_level=risk_assessment.risk_level.value,
                urgency='high' if risk_assessment.risk_score > 0.5 else 'normal'
            )

        return approval_id

    def _select_reviewers(
        self,
        operation_type: str,
        required_count: int
    ) -> List[Dict[str, Any]]:
        """
        Select appropriate reviewers based on expertise and availability.
        """
        # Get available reviewers with relevant expertise
        candidates = self.reviewer_pool.get_available_reviewers(
            expertise=operation_type
        )

        if not candidates:
            # Escalate if no reviewers available
            candidates = self.reviewer_pool.get_escalation_reviewers()

        # Load balance - select least busy reviewers
        candidates.sort(key=lambda r: r['pending_approvals'])

        return candidates[:required_count]

    def _create_approval_request(
        self,
        operation: Dict[str, Any],
        risk_assessment: OperationRiskAssessment,
        reviewers: List[Dict]
    ) -> str:
        """Create approval request in database."""
        approval_id = f"apr_{uuid.uuid4().hex}"

        # Store in approval database
        approval_db.create({
            'approval_id': approval_id,
            'operation': operation,
            'risk_assessment': risk_assessment.__dict__,
            'reviewers': [r['id'] for r in reviewers],
            'status': 'pending',
            'created_at': datetime.utcnow(),
            'expires_at': datetime.utcnow() + timedelta(minutes=15)
        })

        return approval_id


class ApprovalMonitor:
    """
    Monitors approval SLAs and handles timeouts.
    """

    def __init__(self):
        self.timeout_policy = ApprovalTimeoutPolicy()

    def check_approval_status(self, approval_id: str) -> str:
        """
        Check if approval is completed, pending, or expired.
        """
        approval = approval_db.get(approval_id)

        if not approval:
            return 'not_found'

        if approval['status'] in ['approved', 'denied']:
            return approval['status']

        # Check for timeout
        if datetime.utcnow() > approval['expires_at']:
            # Handle timeout according to policy
            return self.timeout_policy.handle_timeout(approval)

        return 'pending'

    def escalate_approval(self, approval_id: str):
        """Escalate stuck approval to higher authority."""
        approval = approval_db.get(approval_id)

        # Add escalation reviewers
        escalation_reviewers = reviewer_pool.get_escalation_reviewers()

        for reviewer in escalation_reviewers:
            notification_service.send_escalation(
                reviewer_id=reviewer['id'],
                approval_id=approval_id,
                original_reviewers=approval['reviewers'],
                reason='SLA timeout'
            )


class MultiPartyApprovalExecutor:
    """
    Agent executor with risk-based approval integration.
    """

    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.risk_assessor = RiskAssessor()
        self.approval_router = ApprovalRouter()
        self.approval_monitor = ApprovalMonitor()

    def execute_operation(
        self,
        operation_type: str,
        target_resource: str,
        parameters: Dict[str, Any],
        user_context: Dict[str, Any]
    ):
        """
        Execute operation with risk-based approval.
        """
        # Assess risk
        risk = self.risk_assessor.assess_risk(
            operation_type=operation_type,
            target_resource=target_resource,
            operation_parameters=parameters,
            agent_id=self.agent_id,
            user_context=user_context
        )

        # Handle based on risk level
        if risk.risk_level == RiskLevel.CRITICAL:
            # Automatic block
            return {
                'success': False,
                'reason': f'Operation blocked: {risk.reasoning}',
                'risk_score': risk.risk_score
            }
        elif risk.risk_level == RiskLevel.LOW:
            # Auto-approve
            return self._execute(operation_type, target_resource, parameters)
        else:
            # Requires approval
            operation_summary = {
                'type': operation_type,
                'resource': target_resource,
                'parameters': parameters,
                'summary': f"{operation_type} on {target_resource}"
            }

            approval_id = self.approval_router.request_approval(
                operation=operation_summary,
                risk_assessment=risk,
                required_approvers=risk.required_approvers
            )

            # Wait for approval with timeout
            timeout_seconds = 300  # 5 minutes
            return self._wait_for_approval(
                approval_id, operation_type, target_resource,
                parameters, timeout_seconds
            )

    def _wait_for_approval(
        self,
        approval_id: str,
        operation_type: str,
        target_resource: str,
        parameters: Dict,
        timeout_seconds: int
    ):
        """Wait for approval with timeout."""
        import time

        start_time = time.time()
        while time.time() - start_time < timeout_seconds:
            status = self.approval_monitor.check_approval_status(approval_id)

            if status == 'approved':
                return self._execute(operation_type, target_resource, parameters)
            elif status == 'denied':
                return {'success': False, 'reason': 'Approval denied by reviewer'}

            time.sleep(2)  # Poll every 2 seconds

        # Timeout
        return {
            'success': False,
            'reason': 'Approval timeout - operation not approved within SLA'
        }

    def _execute(self, operation_type: str, resource: str, params: Dict):
        """Actually execute the operation."""
        # Execution logic here
        return {'success': True, 'result': '...'}
```
Why this works:
Dynamic risk assessment: Risk calculated per operation based on context, not static rules.
Graduated approval: Low-risk auto-approves, medium needs one approver, high needs multiple.
Fast approval path: Push notifications, mobile UI, 2-minute target response time.
Timeout handling: Operations don't wait forever. Explicit timeout policies.
Reviewer load balancing: Distributes requests to prevent individual bottlenecks.
Automatic blocking: Extreme-risk operations never execute regardless of approval.
Pitfalls & Failure Modes
Multi-party authorization systems fail through predictable patterns.
Approval Queue Saturation
Medium-risk threshold set too low. Too many operations require approval. Approval queue grows faster than humans can review. Average approval time increases from 2 minutes to 2 hours. Agents become unusable.
Prevention: Monitor approval volumes and latency. If >50% of operations require approval or approval latency >5 minutes, tune risk thresholds up.
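The thresholds above (50% approval rate, 5-minute latency) can be wired into a simple health check. The function name and warning strings below are hypothetical; the point is that threshold tuning should be driven by explicit metrics, not anecdote.

```python
from typing import List


def approval_health_check(total_ops: int, approvals_required: int,
                          avg_latency_minutes: float) -> List[str]:
    """Hypothetical sketch: flag when risk thresholds need retuning."""
    warnings = []
    if total_ops and approvals_required / total_ops > 0.50:
        warnings.append("over half of operations need approval: raise risk thresholds")
    if avg_latency_minutes > 5:
        warnings.append("approval latency above 5 min: add reviewers or raise thresholds")
    return warnings
```

Run this over a rolling window (daily is a reasonable start) and treat any non-empty result as a tuning task, not an incident.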
Reviewer Fatigue and Rubber-Stamping
Reviewers who approve 50 requests per day stop reading carefully and approve everything just to clear the queue. Eventually a malicious operation gets approved because the reviewer wasn't paying attention.
Prevention: Limit reviewer throughput (max 20 approvals/day). Rotate reviewers. Track approval patterns—flag reviewers who approve everything instantly.
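Tracking approval patterns can be as simple as aggregating per-reviewer decision stats. This is a hedged sketch: the record shape, the 5-second "instant decision" cutoff, and the 80% threshold are illustrative assumptions you would tune against your own data.

```python
from collections import defaultdict
from typing import Dict, List, Set


def flag_rubber_stampers(reviews: List[Dict]) -> Set[str]:
    """
    Hypothetical sketch: flag reviewers whose pattern suggests rubber-stamping.
    Each review record: {'reviewer': str, 'approved': bool, 'seconds_to_decide': float}
    """
    stats = defaultdict(lambda: {'n': 0, 'approved': 0, 'fast': 0})
    for r in reviews:
        s = stats[r['reviewer']]
        s['n'] += 1
        s['approved'] += r['approved']                 # bool counts as 0/1
        s['fast'] += r['seconds_to_decide'] < 5        # decided in under 5 seconds

    # Flag: enough volume, approves everything, mostly instant decisions
    return {
        reviewer for reviewer, s in stats.items()
        if s['n'] >= 10 and s['approved'] == s['n'] and s['fast'] / s['n'] > 0.8
    }
```

Flagged reviewers are candidates for rotation or reduced throughput, not automatic accusation; a run of genuinely routine requests can look similar.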
Risk Assessment Drift
Initial risk factors accurately predict actual risk. Over time, the system evolves: new operation types appear, risk factors go stale, and genuinely high-risk operations start receiving low scores.
Prevention: Regular risk model validation. Compare risk scores to actual incidents. Retrain risk assessment based on production data.
Approval Bypass Through Refactoring
Agent learns that bulk delete requires approval but individual deletes don't. Agent refactors bulk delete into loop of individual deletes. Approval requirement bypassed.
Prevention: Pattern detection for approval evasion. Flag agents that suddenly change operation patterns near approval thresholds.
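One crude but effective detector for this pattern: treat a burst of small operations as equivalent to the bulk operation it replaces. The function, record shape, and thresholds below are hypothetical illustrations of the idea.

```python
from typing import Dict, List


def looks_like_approval_evasion(recent_ops: List[Dict],
                                window_seconds: float = 60.0,
                                burst_threshold: int = 10) -> bool:
    """
    Hypothetical heuristic: many small deletes in a short window resembles
    a bulk delete refactored to stay under the approval threshold.
    Each op record: {'type': str, 'ts': float}  (ts = epoch seconds)
    """
    deletes = [op for op in recent_ops if op['type'] == 'delete']
    if len(deletes) < burst_threshold:
        return False
    timestamps = [op['ts'] for op in deletes]
    return max(timestamps) - min(timestamps) <= window_seconds
```

When the heuristic fires, the natural response is to re-score the burst as a single bulk operation and route it through the approval path it was avoiding.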
Context Loss in Approval UI
Reviewer sees "DELETE table users WHERE id=123" but doesn't see that user requested account deletion. Appears malicious, gets denied incorrectly.
Prevention: Include full context in approval UI—user request, agent reasoning, conversation history. Reviewer needs same context agent had.
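One way to make that context impossible to omit is to bake it into the approval request's shape. The dataclass below is a hypothetical sketch of such a payload; the field names are assumptions, not an API from the implementation above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ApprovalContext:
    """Hypothetical sketch: everything a reviewer should see, not just the operation."""
    operation: str                      # the raw operation being approved
    user_request: str                   # what the end user actually asked for
    agent_reasoning: str                # why the agent chose this operation
    risk_factors: Dict[str, float]      # per-factor scores from the risk assessor
    conversation_excerpt: List[str] = field(default_factory=list)


ctx = ApprovalContext(
    operation="DELETE table users WHERE id=123",
    user_request="Please delete my account and all my data",
    agent_reasoning="User explicitly requested account deletion",
    risk_factors={'environment': 0.6, 'data_sensitivity': 0.7},
)
```

With the user request and agent reasoning required fields, the delete in the example above reads as fulfilling an explicit user request rather than as a suspicious destructive operation.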
Summary & Next Steps
Multi-party authorization solves the autonomy-safety tradeoff by making approval requirements dynamic rather than static. Blanket approval requirements destroy agent usefulness. No approval requirements create unacceptable risk. Risk-based selective approval preserves autonomy for safe operations while requiring human oversight for genuinely risky ones.
The solution requires dynamic risk assessment combining operation type, environment, data sensitivity, agent behavior, scope, and timing. Risk scores map to approval requirements: low-risk auto-approves, medium requires single approval, high requires multiple, critical auto-blocks. Fast approval workflows with push notifications and mobile UI maintain usability.
Implementation requires risk assessment engine, approval routing with load balancing, SLA monitoring with timeout policies, and reviewer management preventing fatigue. The architecture separates risk assessment (what needs approval) from approval workflow (how to get approval efficiently).
Operational challenges include preventing approval queue saturation, avoiding reviewer fatigue, maintaining risk assessment accuracy, detecting approval bypass attempts, and preserving context for reviewers. These are manageable with proper monitoring and tuning.
Here's what to build next:
Implement risk assessment first: Dynamic risk scoring is the foundation. Start with basic factors (operation type, environment) and add sophistication over time.
Build fast approval UI: Mobile-first, push notifications, one-tap approve/deny. Approval latency determines system usability.
Monitor approval metrics: Track volumes, latency, approval rates, reviewer throughput. Optimize thresholds based on production data.
Establish reviewer management: Load balancing, fatigue detection, rotation policies. Reviewers are the bottleneck—manage them carefully.
Create escalation paths: Stuck approvals need escalation. Timeout policies need clear defaults (auto-deny? Escalate? Allow?).
Multi-party authorization isn't about requiring approval for everything risky—it's about requiring approval only when genuinely needed and making that approval fast enough not to break agent autonomy.