An agent needed to update a Lambda function's environment variables. Standard practice: give it an IAM role with lambda:UpdateFunctionConfiguration permission. The agent got the permission. It also got the ability to update any Lambda function in the account, change timeout settings, modify memory allocations, and reconfigure VPC settings. We wanted narrow authorization for one specific operation. IAM gave us broad capability across all operations.
This is the fundamental mismatch between traditional authorization systems and agent requirements. IAM policies, RBAC, and permission systems were designed for deterministic actors making explicit requests. An administrator clicking "Update Environment Variables" in the console. A CI/CD pipeline executing a predefined deployment script. These systems grant persistent permissions scoped to resources and actions.
Agents don't fit this model. An agent doesn't make explicit requests—it makes probabilistic decisions based on runtime context. You can't predict which Lambda it will target or what changes it will propose. The operation emerges from the LLM's reasoning process, not from predefined code paths. Traditional authorization systems force a choice: grant broad permissions and accept the risk, or deny access and break agent functionality.
Neither option is acceptable in production. Broad permissions mean compromised agents can cause unlimited damage. Denied access means agents can't complete tasks. We need fine-grained authorization that works with non-deterministic decision-making—permissions scoped not just to resources and actions, but to specific operations with specific parameters at specific times.
The solution is capability tokens: single-use authorization grants that encode exactly what is allowed. Instead of persistent IAM permissions, agents receive ephemeral tokens that permit one specific operation. "Update Lambda function X's environment variable Y to value Z, valid for the next 5 minutes." The token itself is the authorization. No ambient authority, no persistent credentials, no need to trust agent decision-making.
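Concretely, a grant like "update Lambda function X's environment variable Y to value Z, valid for 5 minutes" can be encoded as a small structured payload. This is a hypothetical shape for illustration only — field names are not from any particular library:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical capability grant; all names and values are illustrative.
grant = {
    "operation_type": "lambda:update_environment",
    "resource_arn": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
    "parameters": {"environment_variables": {"API_KEY": "new-value"}},
    "expires_at": (datetime.now(timezone.utc) + timedelta(minutes=5)).isoformat(),
    "single_use": True,
}

# The grant authorizes exactly this operation on exactly this resource --
# nothing else in the account.
print(grant["operation_type"])  # lambda:update_environment
```

Everything that follows in this article is machinery for issuing, signing, and validating payloads like this one.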
The Authorization Model Mismatch
Traditional authorization has three core assumptions that agents violate.
Assumption 1: Actors make predictable requests
Traditional systems assume you can enumerate what an actor might request. A user might read files in their home directory. A service might query a specific database table. The set of possible requests is knowable in advance, so you can write policies that grant or deny them.
Agents break this. An agent's requests emerge from LLM inference over runtime context. You can't enumerate possible requests because they depend on inputs you don't control. The agent might query the database, or it might read a file, or it might call an API—the decision depends on what the user asked and what context the agent has seen.
Assumption 2: Permissions can be static
Traditional systems grant permissions at deployment time or login time. Those permissions persist for the session duration. They're static because the actor's needs are predictable. A database service always needs database access. A web server always needs to read static assets.
Agents need dynamic permissions that change per operation. One tool call might need read-only database access. The next might need write access to a different table. The third might need no database access at all. Static permissions either grant too much (security risk) or too little (broken functionality).
Assumption 3: The authorization system trusts the actor's intent
Traditional systems verify identity and check permissions, then trust the actor to use those permissions appropriately. If you have s3:PutObject permission, the system trusts you to put the right objects in the right places. Authorization is binary: you can or you can't.
Agents can't be trusted with intent. An LLM with s3:PutObject permission might put objects correctly 99% of the time. In the 1% of cases where it's influenced by prompt injection or context poisoning, it writes malicious objects. You can't trust probabilistic actors to use permissions appropriately.
The correct mental model: authorization as contracts, not ambient capabilities
Traditional authorization: "You have the capability to perform these actions on these resources."
Capability-based authorization: "You are authorized to perform this specific operation with these specific parameters, once, within this time window."
The shift is from persistent ambient authority to ephemeral explicit authorization. Instead of asking "Can this agent access Lambda?" ask "Is this agent authorized to perform this specific Lambda operation right now?"
The invariant to maintain:
No operation executes with ambient authority. Every operation requires a capability token that explicitly authorizes that specific operation. Tokens are single-use, time-limited, and parameter-bound.
The trust boundary:
Don't trust the agent's decisions. Trust capability tokens that were issued by a validation system that verified the operation should be allowed. The agent proposes operations. The validation system issues tokens. Only tokens execute.
Capability Token Architecture
A capability-based authorization system separates token issuance from token use. Agents propose operations, a capability service issues tokens if the operation is authorized, and execution layers accept only valid tokens.
[Figure: Agentic AI capability token architecture — components described below]
Component responsibilities:
Agent Core (purple): Makes decisions, proposes operations. Has no execution capability. Cannot perform operations without tokens.
Operation Proposal (purple): Structured description of what the agent wants to do. Includes operation type, target resources, parameters, and reasoning.
Capability Service (yellow): The authorization authority. Validates proposed operations and issues capability tokens for approved operations. This is the trust boundary.
Validation Engine (yellow/teal): Multi-stage validation combining policy checks, context analysis, and scope validation. Determines if an operation should be authorized.
Token Generator (green): Creates capability tokens for authorized operations. Tokens are cryptographically signed, time-limited, and parameter-bound.
Capability Token (green): The authorization artifact. Contains operation details, constraints, expiration, and digital signature. Possession of a valid token authorizes execution.
Token Validator (yellow): Verifies token authenticity, checks expiration, validates parameters, and ensures single-use. Execution only proceeds with valid tokens.
Execute Operation (green): Actually performs the operation using the authorization granted by the token. Constrained by token parameters.
Key architectural properties:
Separation of authorization from execution: Capability service authorizes operations. Execution layer enforces authorization. The agent has neither authority.
Tokens as authorization: The token itself proves authorization. No need to query permission databases or maintain session state at execution time.
Single-use tokens: Each token authorizes exactly one operation execution. After use, the token is marked as consumed and cannot be reused.
Parameter binding: Tokens encode operation parameters. Execution must match token parameters exactly. This prevents token reuse for different operations.
Time-limited validity: Tokens expire automatically. Compromised tokens have bounded lifetime. Agents can't stockpile authority.
Cryptographic integrity: Tokens are digitally signed. Tampering is detectable. Only the capability service can issue valid tokens.
Comprehensive audit: Token issuance, validation, and use are logged. Complete audit trail for every operation.
Implementation: Building Capability Tokens
Here's what capability-based authorization looks like in production with real agent frameworks.
Capability Token Structure
```python
from dataclasses import dataclass
from typing import Dict, Any, Optional
from datetime import datetime, timedelta
import hmac
import hashlib
import json
import secrets


@dataclass
class CapabilityToken:
    """
    Self-contained authorization for a specific operation.
    Token possession proves authorization.
    """
    # Core identity
    token_id: str               # Unique identifier
    issued_at: datetime
    expires_at: datetime

    # Operation specification
    operation_type: str         # "lambda:update_env", "s3:put_object", etc.
    resource_arn: str           # Specific resource this token authorizes
    parameters: Dict[str, Any]  # Exact parameters allowed

    # Constraints
    allowed_agent_id: str       # Which agent can use this token
    single_use: bool = True
    max_cost_dollars: Optional[float] = None

    # Security
    signature: str = ""         # HMAC signature for integrity
    nonce: str = ""             # Prevents replay attacks

    def to_dict(self) -> Dict[str, Any]:
        """Serialize token for transmission (signature excluded)."""
        return {
            'token_id': self.token_id,
            'issued_at': self.issued_at.isoformat(),
            'expires_at': self.expires_at.isoformat(),
            'operation_type': self.operation_type,
            'resource_arn': self.resource_arn,
            'parameters': self.parameters,
            'allowed_agent_id': self.allowed_agent_id,
            'single_use': self.single_use,
            'max_cost_dollars': self.max_cost_dollars,
            'nonce': self.nonce
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'CapabilityToken':
        """Deserialize token."""
        return cls(
            token_id=data['token_id'],
            issued_at=datetime.fromisoformat(data['issued_at']),
            expires_at=datetime.fromisoformat(data['expires_at']),
            operation_type=data['operation_type'],
            resource_arn=data['resource_arn'],
            parameters=data['parameters'],
            allowed_agent_id=data['allowed_agent_id'],
            single_use=data.get('single_use', True),
            max_cost_dollars=data.get('max_cost_dollars'),
            nonce=data['nonce'],
            signature=data.get('signature', '')
        )


@dataclass
class TokenValidationResult:
    valid: bool
    reason: str


class CapabilityService:
    """
    Issues and validates capability tokens.
    The authorization authority for the system.
    """

    def __init__(self, signing_key: bytes):
        self.signing_key = signing_key
        # Collaborators assumed to exist elsewhere in the system:
        self.token_registry = TokenRegistry()  # tracks used/revoked tokens
        self.policy_engine = PolicyEngine()    # evaluates authorization policy
        self.audit_logger = AuditLogger()      # records issuance, denial, use

    def issue_token(
        self,
        operation_type: str,
        resource_arn: str,
        parameters: Dict[str, Any],
        agent_id: str,
        validity_seconds: int = 300
    ) -> Optional[CapabilityToken]:
        """Issue a capability token if the operation is authorized."""
        # Step 1: Policy validation - is this operation allowed at all?
        policy_result = self.policy_engine.check_authorization(
            operation_type=operation_type,
            resource_arn=resource_arn,
            parameters=parameters,
            agent_id=agent_id
        )

        if not policy_result.authorized:
            self.audit_logger.log_denial(
                operation_type=operation_type,
                resource_arn=resource_arn,
                agent_id=agent_id,
                reason=policy_result.denial_reason
            )
            return None

        # Step 2: Generate token
        token = CapabilityToken(
            token_id=self._generate_token_id(),
            issued_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(seconds=validity_seconds),
            operation_type=operation_type,
            resource_arn=resource_arn,
            parameters=parameters,
            allowed_agent_id=agent_id,
            single_use=True,
            max_cost_dollars=policy_result.max_cost,
            nonce=secrets.token_hex(16)
        )

        # Step 3: Sign token
        token.signature = self._sign_token(token)

        # Step 4: Register token
        self.token_registry.register(token)

        # Step 5: Audit
        self.audit_logger.log_issuance(
            token_id=token.token_id,
            operation_type=operation_type,
            resource_arn=resource_arn,
            agent_id=agent_id,
            expires_at=token.expires_at
        )

        return token

    def validate_token(
        self,
        token: CapabilityToken,
        agent_id: str,
        actual_parameters: Dict[str, Any]
    ) -> TokenValidationResult:
        """Validate a capability token before execution."""
        # Check 1: Signature integrity
        if not self._verify_signature(token):
            return TokenValidationResult(
                valid=False,
                reason="Invalid signature - token may be tampered"
            )

        # Check 2: Expiration
        if datetime.utcnow() > token.expires_at:
            return TokenValidationResult(
                valid=False,
                reason=f"Token expired at {token.expires_at}"
            )

        # Check 3: Agent authorization
        if token.allowed_agent_id != agent_id:
            return TokenValidationResult(
                valid=False,
                reason=f"Token issued for agent {token.allowed_agent_id}, not {agent_id}"
            )

        # Check 4: Single-use enforcement
        if token.single_use and self.token_registry.is_used(token.token_id):
            return TokenValidationResult(
                valid=False,
                reason="Token already used (single-use token)"
            )

        # Check 5: Parameter matching
        # Execution parameters must exactly match token parameters
        if actual_parameters != token.parameters:
            return TokenValidationResult(
                valid=False,
                reason=f"Parameter mismatch. Token authorizes {token.parameters}, "
                       f"got {actual_parameters}"
            )

        # Check 6: Revocation list
        if self.token_registry.is_revoked(token.token_id):
            return TokenValidationResult(
                valid=False,
                reason="Token has been revoked"
            )

        # All checks passed
        return TokenValidationResult(valid=True, reason="Token validation successful")

    def _sign_token(self, token: CapabilityToken) -> str:
        """Generate HMAC signature for token integrity."""
        # Serialize token data (excluding signature field)
        token_data = token.to_dict()
        token_data.pop('signature', None)
        message = json.dumps(token_data, sort_keys=True).encode('utf-8')
        return hmac.new(self.signing_key, message, hashlib.sha256).hexdigest()

    def _verify_signature(self, token: CapabilityToken) -> bool:
        """Verify token signature to detect tampering."""
        expected_signature = self._sign_token(token)
        return hmac.compare_digest(token.signature, expected_signature)

    def _generate_token_id(self) -> str:
        """Generate unique token identifier."""
        return f"cap_{secrets.token_urlsafe(32)}"
```
Why this works:
Self-contained authorization: The token contains all information needed to authorize execution. No database queries at execution time.
Cryptographic integrity: HMAC signature prevents tampering. Only the capability service (which has the signing key) can create valid tokens.
Parameter binding: Tokens encode exact parameters. Execution must match exactly. An agent can't get a token for "update Lambda X environment" and use it to update Lambda Y.
Single-use enforcement: Token registry tracks which tokens have been used. Prevents replay attacks where an agent reuses a token for repeated operations.
Time-limited validity: Tokens expire automatically. Even if an agent stockpiles tokens, they become useless after expiration.
Agent binding: Tokens are issued to specific agents. A token issued to agent A can't be used by agent B.
Integration with LangGraph Agents
Here's how capability tokens integrate with a LangGraph agent.
```python
# Continues from the CapabilityService module above; CapabilityToken,
# CapabilityService, Dict, and Any are assumed to be in scope.
from dataclasses import dataclass
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
import secrets


class AgentState(TypedDict, total=False):
    messages: Annotated[list, operator.add]
    capability_tokens: list
    execution_results: list
    # Keys written during the workflow:
    proposed_operation: dict
    token_granted: bool
    denial_reason: str


class CapabilityAwareAgent:
    """LangGraph agent that uses capability tokens for authorization."""

    def __init__(self, capability_service: CapabilityService):
        self.capability_service = capability_service
        self.agent_id = self._generate_agent_id()
        self.graph = self._build_graph()

    def _generate_agent_id(self) -> str:
        """Unique identifier for this agent instance."""
        return f"agent_{secrets.token_urlsafe(8)}"

    def _build_graph(self) -> StateGraph:
        """Build LangGraph workflow with capability token integration."""
        workflow = StateGraph(AgentState)

        # Agent decides what to do
        workflow.add_node("decide", self._decide_action)
        # Request capability token for the action
        workflow.add_node("request_capability", self._request_capability)
        # Execute with token
        workflow.add_node("execute", self._execute_with_token)

        # Edges
        workflow.set_entry_point("decide")
        workflow.add_edge("decide", "request_capability")
        workflow.add_conditional_edges(
            "request_capability",
            self._check_capability_granted,
            {"granted": "execute", "denied": END}
        )
        workflow.add_edge("execute", END)

        return workflow.compile()

    def _decide_action(self, state: AgentState) -> AgentState:
        """
        Agent decides what operation to perform.
        No execution capability at this stage.
        """
        # Agent reasoning happens here.
        # For example: LLM decides to update a Lambda environment.

        # Propose operation (but can't execute it)
        proposed_operation = {
            'operation_type': 'lambda:update_environment',
            'resource_arn': 'arn:aws:lambda:us-east-1:123456789012:function:my-function',
            'parameters': {
                'environment_variables': {
                    'API_KEY': 'new-value'
                }
            }
        }

        state['proposed_operation'] = proposed_operation
        return state

    def _request_capability(self, state: AgentState) -> AgentState:
        """Request capability token for the proposed operation."""
        operation = state['proposed_operation']

        # Request token from capability service
        token = self.capability_service.issue_token(
            operation_type=operation['operation_type'],
            resource_arn=operation['resource_arn'],
            parameters=operation['parameters'],
            agent_id=self.agent_id,
            validity_seconds=300  # 5 minute window
        )

        if token:
            state['capability_tokens'] = [token]
            state['token_granted'] = True
        else:
            state['token_granted'] = False
            state['denial_reason'] = "Operation not authorized"

        return state

    def _check_capability_granted(self, state: AgentState) -> str:
        """Conditional edge: was capability granted?"""
        return "granted" if state.get('token_granted', False) else "denied"

    def _execute_with_token(self, state: AgentState) -> AgentState:
        """Execute operation using capability token."""
        token = state['capability_tokens'][0]
        operation = state['proposed_operation']

        # Tool executor validates token before execution
        executor = ToolExecutor(self.capability_service)
        result = executor.execute(
            token=token,
            agent_id=self.agent_id,
            operation_type=operation['operation_type'],
            resource_arn=operation['resource_arn'],
            parameters=operation['parameters']
        )

        state['execution_results'] = [result]
        return state


@dataclass
class ExecutionResult:
    success: bool
    output: Any = None
    error: str = ""


class ToolExecutor:
    """Executes operations with capability token authorization."""

    def __init__(self, capability_service: CapabilityService):
        self.capability_service = capability_service

    def execute(
        self,
        token: CapabilityToken,
        agent_id: str,
        operation_type: str,
        resource_arn: str,
        parameters: Dict[str, Any]
    ) -> ExecutionResult:
        """Execute operation if token is valid."""
        # Validate token
        validation = self.capability_service.validate_token(
            token=token,
            agent_id=agent_id,
            actual_parameters=parameters
        )

        if not validation.valid:
            return ExecutionResult(
                success=False,
                error=f"Token validation failed: {validation.reason}"
            )

        # Token is valid - execute operation
        try:
            result = self._perform_operation(
                operation_type=operation_type,
                resource_arn=resource_arn,
                parameters=parameters
            )

            # Mark token as used
            self.capability_service.token_registry.mark_used(token.token_id)

            return ExecutionResult(success=True, output=result)
        except Exception as e:
            return ExecutionResult(success=False, error=str(e))

    def _perform_operation(
        self,
        operation_type: str,
        resource_arn: str,
        parameters: Dict[str, Any]
    ) -> Any:
        """
        Actually perform the operation.
        This is where AWS API calls, database operations, etc. happen.
        """
        # Implementation depends on operation_type
        if operation_type == "lambda:update_environment":
            # Update Lambda environment variables
            import boto3
            lambda_client = boto3.client('lambda')
            function_name = resource_arn.split(':')[-1]
            response = lambda_client.update_function_configuration(
                FunctionName=function_name,
                Environment={'Variables': parameters['environment_variables']}
            )
            return response

        # Other operation types...
        raise NotImplementedError(f"Operation type {operation_type} not implemented")
```
Key integration points:
Agent proposes, doesn't execute: The agent's decision-making is separated from execution. It can propose any operation, but executing requires a capability token.
Token request is explicit: There's a dedicated workflow step for requesting capabilities. This makes authorization visible and auditable.
Validation before execution: The tool executor validates tokens before performing operations. Invalid tokens stop execution immediately.
Single-use enforcement: After execution, tokens are marked as used. The agent can't reuse the same token for repeated operations.
Graceful denial handling: If a capability is denied, the workflow terminates gracefully. The agent knows it was denied and why.
Pitfalls & Failure Modes
Capability token systems fail in production through predictable patterns.
Token Request Storms
An agent gets stuck in a loop making the same decision repeatedly. Each iteration requests a new capability token. The capability service is overwhelmed with token requests—thousands per minute. The service becomes a bottleneck, adding seconds of latency per operation.
Why it happens: Agents retry failed operations. If each retry requests a new token, retry logic creates request amplification.
Prevention: Rate-limit token requests per agent. Track request patterns and circuit-break agents making excessive requests. Implement exponential backoff for token requests after failures.
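One way to damp a request storm is a sliding-window rate limiter in front of the capability service. A minimal sketch — the class name, window size, and threshold are illustrative, not from any framework:

```python
import time
from collections import defaultdict, deque
from typing import Optional


class TokenRequestLimiter:
    """Reject token requests from agents that exceed a per-window budget."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._requests = defaultdict(deque)  # agent_id -> request timestamps

    def allow(self, agent_id: str, now: Optional[float] = None) -> bool:
        """Return True if this agent may request another token right now."""
        now = time.monotonic() if now is None else now
        window = self._requests[agent_id]
        # Drop timestamps that have aged out of the window
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        if len(window) >= self.max_requests:
            return False  # circuit-break: agent is storming
        window.append(now)
        return True
```

Call `limiter.allow(agent_id)` before policy evaluation; agents that hit the limit should back off exponentially rather than retry immediately.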
Parameter Drift Between Request and Execution
An agent requests a token for operation X with parameters P1. Before execution, the agent's context updates and it decides parameters should be P2. It tries to execute with the original token but different parameters. Validation fails. The operation never completes.
Why it happens: Time passes between token request and execution. Agent state evolves. Parameters that were correct at request time are stale at execution time.
Prevention: Minimize time between token request and execution. Make token requests immediately before execution, not speculatively. Implement token refresh mechanisms for long-running operations.
Capability Creep Through Separate Tokens
An agent needs to perform operation A (read data) and operation B (write data). It requests separate tokens for each. Both are approved based on individual policy checks. The agent now has combined capabilities that together violate policy—it can read sensitive data and write it to unauthorized locations.
Why it happens: Policies evaluate operations independently. They don't consider what combinations of operations an agent might perform.
Prevention: Track active tokens per agent. Implement aggregate capability limits. Revoke tokens when dangerous combinations are detected.
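A sketch of aggregate checking: before issuing a token, compare the requested operation against the agent's other active tokens and reject combinations that policy forbids. The class names and the rule set here are illustrative assumptions:

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rule set: pairs of operation types one agent must not
# hold simultaneously.
FORBIDDEN_COMBINATIONS: Set[Tuple[str, str]] = {
    ("db:read_sensitive", "s3:put_object"),  # read-then-exfiltrate path
}


class ActiveTokenTracker:
    """Track live (unexpired, unused) operation types per agent."""

    def __init__(self):
        self._active: Dict[str, List[str]] = {}

    def would_violate(self, agent_id: str, new_op: str) -> bool:
        """True if granting new_op creates a forbidden combination."""
        for held_op in self._active.get(agent_id, []):
            if (held_op, new_op) in FORBIDDEN_COMBINATIONS or \
               (new_op, held_op) in FORBIDDEN_COMBINATIONS:
                return True
        return False

    def grant(self, agent_id: str, op: str) -> None:
        """Record that a token for op was issued to agent_id."""
        self._active.setdefault(agent_id, []).append(op)
```

The capability service would consult `would_violate` during policy validation and remove entries as tokens expire or are consumed.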
Clock Skew Breaks Expiration
Token expiration is based on wall clock time. Agent execution environment and capability service run on servers with slightly different clocks. A token that should expire in 5 minutes expires in 3 minutes or 7 minutes depending on which server's clock is used.
Why it happens: Distributed systems have clock skew. Different servers disagree on current time.
Prevention: Use logical clocks or relative time (TTL in seconds). Synchronize clocks with NTP. Build in clock skew tolerance—treat tokens as expired 30 seconds before nominal expiration.
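Skew tolerance can be built into the expiration check itself by treating tokens as expired slightly early. A minimal sketch, with a 30-second margin chosen for illustration:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CLOCK_SKEW_MARGIN = timedelta(seconds=30)


def is_expired(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """Treat a token as expired CLOCK_SKEW_MARGIN before its nominal
    expiry, so a validator whose clock runs behind the issuer's still
    rejects stale tokens."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at - CLOCK_SKEW_MARGIN
```

The margin should exceed your measured worst-case skew between the capability service and execution hosts.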
Token Revocation Lag
An agent is compromised. Security team revokes all its tokens. But tokens were already distributed to the agent. The agent executes operations using cached tokens before revocation propagates. Operations succeed that should be blocked.
Why it happens: Revocation lists are eventually consistent. Updates propagate over seconds to minutes. Tokens cached by agents don't get revocation updates.
Prevention: Short token lifetimes (5 minutes max). Check revocation at validation time, not just issuance time. Implement token registry with strong consistency guarantees.
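The check-at-use-time pattern can be sketched as follows. The in-memory set here only illustrates the ordering of the checks; a production revocation list would be a strongly consistent store (for example, a primary-database read), not process-local state:

```python
from datetime import datetime, timedelta, timezone


class RevocationList:
    """Revocation checked at validation time, not just issuance time."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, token_id: str) -> None:
        self._revoked.add(token_id)

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked


def still_valid(token_id: str, expires_at: datetime,
                revocations: RevocationList) -> bool:
    """Check revocation first, then expiry. Short lifetimes bound the
    damage window when revocation propagation lags."""
    if revocations.is_revoked(token_id):
        return False
    return datetime.now(timezone.utc) < expires_at
```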
Summary & Next Steps
Capability tokens solve the fundamental authorization problem for non-deterministic agents. Traditional IAM systems grant persistent permissions scoped to resources and actions. This forces broad authorization that creates unacceptable risk when agents make probabilistic decisions.
Capability tokens provide operation-level authorization. Each token authorizes exactly one operation with exact parameters, valid for a limited time. Agents propose operations, a capability service issues tokens for authorized operations, and execution requires valid tokens. No ambient authority, no persistent credentials, fine-grained control.
The implementation requires three components: token structure with cryptographic signatures and parameter binding, capability service that issues and validates tokens, and integration with agent frameworks that separates decision from execution. These aren't theoretical—they're production patterns that work with LangGraph, AutoGPT, and custom agent implementations.
The operational challenge is balancing security and usability. Short token lifetimes improve security but require more token requests. Strict parameter binding prevents misuse but creates brittleness when parameters evolve. Token validation adds latency but prevents unauthorized operations.
Here's what to build next:
Implement token infrastructure before deploying agents: Build capability service, token registry, and validation logic before production. Retrofitting authorization is harder than building it correctly.
Define operation-level policies: Map agent operations to specific tokens with specific parameters. Start with most dangerous operations (writes, deletes, external API calls).
Integrate tokens with existing agents: Add token request and validation to your agent workflow. Make token acquisition explicit in the decision-to-execution path.
Monitor token patterns: Track token issuance rates, validation failures, and usage patterns. Anomalies indicate agent misbehavior or policy misconfigurations.
Build token management tools: Operators need visibility into active tokens, ability to revoke tokens, and audit logs of token usage. These are operational requirements, not nice-to-haves.
Capability tokens aren't optional for production agents—they're the minimum viable authorization model. The question is whether you implement them before or after discovering that ambient authority for probabilistic actors is indefensible.
References and Further Reading
- Defeating Prompt Injections by Design
- Tenuo: Task-Scoped Authority for AI Agents
- Zero-Trust Auth, for Apps and AI
- Eclipse Biscuit: An authorization token with decentralized verification, offline attenuation and strong security policy enforcement based on a logic language
Related Articles
- Zero Trust Agents: Why 'Verify Every Tool Call' Is the Only Defensible Architecture
- Trust Gradients: Dynamic Permission Scaling Based on Agent Behavior
- The Autonomous Credential Problem: When Your AI Needs Root Access