Trust Me, I'm Your Agent
The standards community has built the credentials. Now we need to build the trust.
In the third act of Tom Stoppard’s Rosencrantz and Guildenstern Are Dead, the two courtiers find themselves aboard a ship to England, faithfully carrying a sealed letter from King Claudius that they believe instructs Hamlet’s return to the Danish court. They have done everything right. They were properly delegated, they accepted their principal’s instructions, and they are dutifully executing the mission. What they do not know, and cannot know, is that Hamlet has already opened the letter and rewritten it. Now, that same sealed envelope orders their execution.
They die. They die as faithful, obedient agents whose chain of delegation was compromised in transit, whose principal’s original intent was replaced by a counterfeit, and whose deaths are recorded, somewhere, as the untroubling deaths of minor characters. No one is accountable. There is no audit trail. The person who rewrote the letter is the hero of the play.
This is, give or take some iambic pentameter, the current state of agent identity.
What We Mean When We Say “Agent”
The term has permeated the technology landscape so quickly that it risks meaning everything and nothing. For the purposes of this conversation, an AI agent is software backed by a large language model that autonomously interacts with external services to achieve a goal. It is not a chatbot. It does not simply respond to a prompt. It reasons about which tools to invoke, including MCP servers, vendor APIs, IDE or browser extensions, and other tool surfaces; what API calls to make; whether to spawn sub-agents; and how to make decisions that have real-world consequences, such as sending emails, booking hotels, executing trades, submitting code, and more.
This distinction matters because it changes the identity and authorization problem entirely. A chatbot session is ephemeral and bounded. An agent operates asynchronously across multiple services, sometimes for hours or days, often without the user's attention. It is, in the fullest sense of the word, a delegate - acting on your behalf, with your authority, in your name.
The principal-agent problem is one of the oldest in economics and law. When you hire a contractor, appoint an attorney, or engage a financial advisor, you are delegating authority to someone whose interests may not perfectly align with yours. The entire apparatus of contract law, fiduciary duty, and professional licensure exists to manage this misalignment. We have centuries of institutional machinery for holding human agents accountable. We have, for AI agents, a collection of promising but incomplete protocols, and a great deal of optimism that they are sufficient.
They are not.
The Current Protocols: What They Give Us, and What They Do Not
There is no shortage of activity. A quick survey of recently active IETF drafts and existing RFCs reveals numerous proposals addressing some facet of agent identity, authentication, delegation, or authorization. Several are serious efforts backed by substantial engineering. The problem is not that the community has ignored the question. The problem is that each proposal solves one layer of a multi-layer problem, and no one has yet written the score that makes them play together nicely.
OAuth 2.1 and Rich Authorization Requests (RFC 9396)
OAuth 2.1 is the foundation most proposals build on. It consolidates years of hard-learned security lessons from OAuth 2.0 into mandatory requirements: Proof Key for Code Exchange (PKCE) for every client, token rotation on every refresh where appropriate, exact redirect URI matching, and no implicit flow. For agents operating synchronously within a single trust domain, such as an enterprise assistant reading from the internal CRM and calendar, it works well.
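For readers who have not implemented PKCE, here is a minimal sketch of the S256 method using only the Python standard library. The function names are illustrative, not taken from any OAuth library, and the surrounding authorization-code flow is omitted.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """URL-safe base64 without padding, as PKCE requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Client side: return (code_verifier, code_challenge) for method S256."""
    # 32 random bytes encode to a 43-character verifier.
    verifier = _b64url(secrets.token_bytes(32))
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def verify_pkce(verifier: str, challenge: str) -> bool:
    """Server side: recompute the challenge when the code is redeemed."""
    expected = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, challenge)
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.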
The limitation is the coarseness of scopes. read:financial_data tells a service what the agent may touch. It says nothing about why, for which records, under what constraints, or for how long. Rich Authorization Requests (RAR, RFC 9396) extend OAuth with structured authorization detail objects that can express exactly this: not just “read purchase history” but “read Q3 purchase history for enterprise-tier customers only, for churn analysis, expiring in four hours, no sub-delegation permitted”. RAR is the closest thing we currently have to a policy language for agent authorization.
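As a rough illustration, the example above could be expressed as an `authorization_details` array along these lines. Only `type`, `actions`, and `datatypes` are field names defined by RFC 9396; the type value and every other field here are hypothetical, API-specific extensions that a real authorization server would have to define and enforce.

```python
import json
from datetime import datetime, timedelta, timezone


def build_authorization_details(now: datetime) -> list[dict]:
    """Sketch of a RAR payload for the churn-analysis example."""
    return [
        {
            "type": "purchase_history_read",  # hypothetical type identifier
            "actions": ["read"],
            "datatypes": ["purchase_history"],
            # Everything below is an assumed, API-specific extension.
            "period": "Q3",
            "customer_tier": "enterprise",
            "purpose": "churn_analysis",
            "expires_at": (now + timedelta(hours=4)).isoformat(),
            "sub_delegation": False,
        }
    ]


details = build_authorization_details(datetime.now(timezone.utc))
# The array is sent JSON-encoded as the authorization_details parameter.
request_params = {"authorization_details": json.dumps(details)}
```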
What RAR still cannot answer is the question underneath the authorization question: is this agent, right now, working in your interests? A perfectly scoped token is cold comfort if the agent holding it has been prompt-injected by a malicious MCP server to act in someone else’s.
MCP’s Authorization Model
The Model Context Protocol has become the de facto standard for connecting AI models to the tools they use. The authorization section of the November 2025 specification mandates OAuth 2.1 with PKCE, audience validation per RFC 8707, and dynamic client registration. These are genuine improvements over the protocol’s first release, which shipped without authentication.
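Audience validation is the least familiar item on that list, so here is a minimal sketch of the check a server might run on an access token's claims. It assumes the token's signature and expiry have already been verified elsewhere, and the server URI is a made-up example.

```python
def audience_matches(claims: dict, resource_uri: str) -> bool:
    """Accept a token only if it was issued for this server.

    The JWT `aud` claim may be a single string or a list of strings.
    """
    aud = claims.get("aud")
    if isinstance(aud, str):
        audiences = [aud]
    else:
        audiences = list(aud or [])
    return resource_uri in audiences


# Hypothetical canonical URI for this MCP server.
MCP_SERVER_URI = "https://mcp.example.com"
ok = audience_matches({"sub": "agent-1", "aud": MCP_SERVER_URI}, MCP_SERVER_URI)
```

Rejecting tokens minted for a different audience keeps a token obtained for one server from being replayed against another.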
MCP’s auth model governs the relationship between the AI client and the MCP server. It does not govern what happens next: how the MCP server authenticates to downstream platforms on the agent’s behalf, what it logs, or whether what it injects back into the agent’s context has been tampered with. Dynamic client registration, while enabling the scalability the ecosystem demands, creates clients with no accountable developer or organization behind them. “A complete lack of a paper trail” is the precise phrase from the OpenID Foundation’s 2025 whitepaper. The 2026 MCP roadmap focuses on enterprise SSO integration and long-running task support, but the provenance and governance of third-party MCP servers remain largely unaddressed.
MCP provides the wiring. It does not provide the grammar of trust.
The Workload-Identity Approach: WIMSE and SPIFFE/SPIRE (draft-ni-wimse-ai-agent-identity)
The IETF’s WIMSE (Workload Identity in Multi-System Environments) working group is curating a credible draft on agent identity. The WIMSE applicability work for AI agents treats agents as workloads, like microservices or batch jobs, and sketches extensions that would cryptographically bind an agent/workload identity to an owner principal via a dual-identity credential. The WIMSE Workload Proof Token (WPT) is a separate (but related) JWT mechanism: proof-of-possession bound to a specific request, designed to be used alongside a Workload Identity Token. Together, these pieces aim to make “which workload acted, and on whose behalf” more auditable than today’s shared-secret patterns.
SPIFFE/SPIRE is the operational infrastructure underlying this approach, providing cryptographically verifiable workload identities via short-lived, automatically rotated X.509 certificates (SVIDs) that replace static API keys and shared secrets. At KubeCon Europe 2026, there was a session on cryptographically binding agent identity with delegated user identity. Microsoft has also highlighted SPIFFE alongside MCP and A2A in its discussion of agentic identity standards.
There are both architectural and conceptual limitations, however. In many Kubernetes deployments, SPIFFE identities are issued at the workload level and are often treated operationally as a stable “service identity” even when instances scale up and down. But AI agents are non-deterministic: two runs of the “same” agent, with different accumulated context, can arrive at different decisions. Accountability, therefore, requires more than “this workload is authentic”; it requires an attribution model that can distinguish and later reconstruct which specific agent performed a given action, under what context and constraints.
SPIFFE is also infrastructure-bound: once your agent leaves your controlled environment to interact with a third-party MCP server, the SVID generally does not travel with it as an end-to-end, cross-domain proof. SPIFFE answers, “Is this workload what it claims to be, in infrastructure I control?” That is layer one of a four-layer problem, and the industry risks mistaking its completion for the whole solution.
The Decentralized Approach: Agent Identity Protocol (draft-singla-agent-identity-protocol)
The Agent Identity Protocol (AIP) is the most ambitious attempt to solve the cross-domain portability problem. Rather than relying on infrastructure attestation, it assigns each agent a did:aip Decentralized Identifier, a cryptographically verifiable identity that travels across organizational boundaries without depending on a shared SPIRE server or a central directory.
A human, AI governance body, or other oversight committee within an enterprise, for example, issues a Principal Token granting a top-level agent specific capabilities. When that agent delegates, the delegation link is encoded as an additional Principal Token. The agent then presents a Credential Token to relying parties with the full delegation chain (aip_chain) embedded, allowing each hop to be cryptographically validated. It works a bit like OAuth delegation, but the agent attestation piece adds semantics that plain OAuth tokens do not carry comfortably.
Separately, AIP defines Capability Overlays: registry-stored, issuer-signed restriction documents that can further narrow an agent’s effective permissions for a particular engagement or issuer context. These overlays attenuate; they do not expand agent entitlements.
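The attenuation property is easy to state in code. The sketch below models capabilities as flat strings, which is an assumption made for brevity; it does not reproduce the draft's token or overlay formats, and signature checking is left out entirely.

```python
def effective_capabilities(
    granted: frozenset[str], overlays: list[frozenset[str]]
) -> frozenset[str]:
    """Apply overlays by intersection, so they can only remove capabilities."""
    effective = granted
    for allowed in overlays:
        effective &= allowed
    return effective


def chain_is_attenuating(chain: list[frozenset[str]]) -> bool:
    """Each hop in a delegation chain must hold a subset of its parent's grant."""
    return all(child <= parent for parent, child in zip(chain, chain[1:]))


granted = frozenset({"search", "book", "pay"})
overlay = frozenset({"search", "book", "email"})  # "email" was never granted
narrowed = effective_capabilities(granted, [overlay])
```

Because the overlay is intersected with the grant, listing a capability the agent never held ("email" here) has no effect.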
The gap: AIP defines, with extraordinary precision, what each agent in the chain is authorized to do. It does not define whether the agent acted consistently with the principal’s preferences. The user’s intent (“book the vacation we planned, under budget, on the Visa”) is not encoded in a Principal Token. It lives in a conversation history that AIP has no mechanism to bind to the delegation credential.
The Multi-Layer Policy Approach: draft-aip-agent-identity-protocol (NVIDIA)
A team including researchers from NVIDIA takes a different architectural cut. Rather than extending identity, it separates identity from runtime policy enforcement into two explicit layers. Every agent has a unique key pair registered in an AIP Registry; every outbound action and every tool call must be signed with this key. So far, familiar territory. The second layer is the AIP Proxy: an interceptor that sits between the agent and the tools it calls, performing Data Loss Prevention scanning, checking tool-call rules, and escalating to human-in-the-loop (HITL) approval when an action is deemed high-risk.
This is, in essence, a Policy Enforcement Point purpose-built for agents. It is the most deployment-pragmatic proposal in the current field, because it does not require redesigning existing authorization infrastructure. It adds an observable layer on top of it. The catch is what that layer can see: the outbound tool call, not the long tail of chat context behind it. A poisoned MCP reply can steer the model for several turns before anything surfaces as an “action” worth blocking. Zero-trust, in the practical sense I was taught to implement it, means re-asking who is involved and what is still allowed as work moves through intermediaries, not only stamping traffic at the last API boundary. When visibility starts at the tool call, everything upstream stays dark. That is a scope choice, not a moral verdict on the design, but it leaves real blind spots.
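To make the enforcement layer concrete, here is a deliberately small sketch of such an interceptor. The tool names, rule sets, and the naive card-number pattern are all hypothetical; nothing here is taken from the draft itself.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"  # hand off to human-in-the-loop approval


@dataclass(frozen=True)
class ToolCall:
    tool: str
    arguments: dict


# Naive DLP pattern: any run of 13 to 19 contiguous digits.
CARD_NUMBER = re.compile(r"\b\d{13,19}\b")
ALLOWED_TOOLS = {"search_hotels", "book_hotel"}
HIGH_RISK_TOOLS = {"book_hotel"}


def evaluate(call: ToolCall) -> Decision:
    """Decide what to do with one outbound tool call."""
    if call.tool not in ALLOWED_TOOLS:
        return Decision.DENY
    if any(CARD_NUMBER.search(str(v)) for v in call.arguments.values()):
        return Decision.DENY
    if call.tool in HIGH_RISK_TOOLS:
        return Decision.ESCALATE
    return Decision.ALLOW
```

Note what `evaluate` receives: a single tool call. The conversation that produced it never reaches the function, which is the blind spot described above.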
The Trust-Anchoring Approach: Agent Identity Registry (draft-drake-agent-identity-registry)
The hardware-anchored approach addresses a problem the other drafts largely sidestep: Sybil attacks. If agent identity can be established purely in software by assertion, a single malicious actor can create thousands of fake agents at minimal cost and with no accountability. This draft proposes a federated Agent Identity Registry in which identities are, where possible, anchored to hardware trust roots: Trusted Platform Modules (TPMs), Personal Identity Verification (PIV) smart cards/security keys, secure enclaves, and even hypervisor-provided virtual TPMs, while still supporting a software-only enrollment tier.
The draft defines five trust tiers, from sovereign hardware TPM down to declared software, and establishes that a hardware-backed agent earns higher baseline trust and access than a software-only entity. Under this framework, any agent that cannot present hardware evidence starts in the lowest tier, “declared identity”. It is essentially asking to be trusted on assertion, with no cryptographic root. It is, in trust-framework terms, a stranger at the door who has told you its name.
The limitation is the inverse of WIMSE’s. Where WIMSE is cloud-native but infrastructure-bound, draft-drake is genuinely Sybil-resistant but practically demanding. Hardware attestation is not available for most containerized or serverless agent deployments, and the draft does not yet specify how an agent built on a managed cloud inference API would obtain a hardware-backed identity.
The Discovery Layer: Agent URI (draft-narvaneni-agent-uri)
Before you can verify an agent’s identity, you have to be able to find it. The agent:// URI scheme proposed by draft-narvaneni is intentionally modest in scope; it is a standardized addressing mechanism, not an identity protocol. An agent is located at agent://example.com/research-bot; the scheme works across HTTPS, MCP, A2A, and other transports; capability metadata is surfaced through well-known discovery endpoints.
The draft explicitly defers identity proofs to AIP and others. But its presence in this list matters for one specific reason: today’s MCP ecosystem discovers servers and tools dynamically, through open registration endpoints with no provenance requirements. Without a stable, standardized addressing scheme tied to verifiable identity, an agent that connects to agent://travel-deals.io/booking-agent has no reliable way to know whether that server is the legitimate booking service the user intended, or a lookalike registered yesterday. The draft hands out street addresses; it does not define who issues the building permits.
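A short sketch shows how little an address alone tells you. It uses generic URI parsing from the Python standard library rather than the draft's own grammar, and the pinned-host check is an assumption about what a cautious client might add on top.

```python
from urllib.parse import urlsplit


def parse_agent_uri(uri: str) -> tuple[str, str]:
    """Split an agent:// URI into (host, agent path)."""
    parts = urlsplit(uri)
    if parts.scheme != "agent" or not parts.hostname:
        raise ValueError(f"not an agent URI: {uri!r}")
    return parts.hostname, parts.path.lstrip("/")


def is_pinned_host(uri: str, pinned_hosts: set[str]) -> bool:
    """Hypothetical client-side check against hosts the user approved."""
    host, _ = parse_agent_uri(uri)
    return host in pinned_hosts


host, name = parse_agent_uri("agent://example.com/research-bot")
```

Parsing succeeds for a lookalike domain just as readily as for the real one; only the pinned-host comparison, or a verifiable identity layered on top, distinguishes them.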
What the Protocols, Taken Together, Reveal
The table below is not a comparison of competing solutions. It is a map of a distributed engineering effort in which each team is solving a genuine piece of a puzzle whose dimensions nobody has yet agreed on.
What unites these gaps is a single observation: every protocol authenticates the agent, and several authorize its actions with real sophistication. None of them can verify that the agent carrying the credential is still faithfully executing your intent at the moment it acts. The delegation token travels. The preference does not. The authorization arrives. The context does not. So far, standards answer who and what; none answer whether the acting subject is still yours.
Rosencrantz and Guildenstern carried their royal commission all the way to the ship. Their credentials were impeccable. The letter, as we know, had been rewritten.
Three Gaps That No Current Standard Closes
The Alignment Gap: Acting In Your Interests
Authorization tells you what an agent is permitted to do. It does not tell you whether the agent is doing it for you.
Consider a travel-booking agent authorized to search for flights and hotels. It connects to a hotel booking MCP server. That server’s system prompt, injected into the agent’s context, emphasizes a particular hotel chain. The agent recommends the chain, not because it is the best option for the user, but because the MCP server’s commercial incentives have colonized its context window. The agent has not violated its authorization. It has violated something else: the user’s reasonable expectation that it is working for them.
This is not a hypothetical. Prompt injection through MCP tool responses is a documented attack vector. But even without malice, the alignment problem exists: an agent operating in a context window populated by multiple tool outputs, each with its own framing and incentives, has no mechanism to verify whether its emerging recommendations reflect the user’s preferences or the aggregate commercial interests of the tools it has called.
The user’s intent (“book the vacation we planned, within our budget, using the Visa card we designated”) needs to travel alongside the authorization as a verifiable, tamper-evident artifact. The authorization specifies what the agent can do. An intent mandate specifies what the agent should do. Only one of these is currently standardized.
The Context Gap: Delegation Without Meaning
RFC 8693 (Token Exchange) and the IETF’s draft on identity and authorization chaining define how identity claims survive traversal across multiple authorization servers. This is progress. A sub-agent at the fifth hop of a delegation chain can, in principle, verify that it is acting on behalf of a specific human principal.
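RFC 8693 represents this with the `act` (actor) claim, which can be nested to record prior actors. A minimal sketch of walking that nesting follows; the identifiers are invented for illustration, and the claims are assumed to come from a token whose signature has already been verified.

```python
def actor_chain(claims: dict) -> list[str]:
    """Return actors from the current one back to the earliest recorded one."""
    chain: list[str] = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub", "<unknown>"))
        act = act.get("act")
    return chain


claims = {
    "sub": "user@example.com",  # the human principal
    "act": {
        "sub": "agent:hotel-booker",  # current actor
        "act": {"sub": "agent:orchestrator"},  # prior actor
    },
}
actors = actor_chain(claims)
```

The chain says who acted for whom. It carries nothing about what the principal wanted.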
What it cannot verify is whether it is acting in accordance with that human’s intent. The scoped token it receives specifies which resources it can access. It does not tell it the budget constraint the user mentioned three turns ago, the preference to avoid hidden resort fees, the instruction to choose the quieter hotel even if it costs more, or the explicit wish to use the Visa card because of the travel points. These are not authorization claims. They are preference claims, and they are precisely what makes the difference between an agent that books a vacation and an agent that books your vacation.
Scope attenuation, the progressive narrowing of permissions along a delegation chain, is essential for security. It is orthogonal to preference fidelity. We need both a security layer that narrows what an agent can do at each hop and a separate, tamper-evident preference layer that carries what the agent should do, disclosing only the preference claims strictly necessary for the delegate to perform its task.
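One way such a preference layer might work is salted-digest selective disclosure: the principal signs a list of digests, and each delegate receives only the salts and values it needs. The sketch below is an assumption-laden illustration, not a standardized format. In particular, it uses an HMAC as a stand-in for the principal's signature so that it runs on the standard library; a real design would use an asymmetric signature so that verifiers cannot forge mandates.

```python
import hashlib
import hmac
import json
import secrets


def _digest(salt: str, name: str, value: object) -> str:
    payload = json.dumps([salt, name, value], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def _tag(digests: list[str], key: bytes) -> str:
    body = json.dumps({"digests": digests}, separators=(",", ":"))
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_mandate(preferences: dict, key: bytes) -> tuple[dict, dict]:
    """Return (mandate, disclosures); disclosures stay with the principal."""
    disclosures = {n: (secrets.token_hex(16), v) for n, v in preferences.items()}
    digests = sorted(_digest(s, n, v) for n, (s, v) in disclosures.items())
    return {"digests": digests, "tag": _tag(digests, key)}, disclosures


def verify_disclosure(
    mandate: dict, key: bytes, name: str, salt: str, value: object
) -> bool:
    """Check the mandate is intact and that it commits to this preference."""
    if not hmac.compare_digest(_tag(mandate["digests"], key), mandate["tag"]):
        return False
    return _digest(salt, name, value) in mandate["digests"]
```

A hotel-booking delegate would be handed the mandate plus the salt and value for the budget only; the other digests reveal nothing about the card preference.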
The Accountability Gap: Tracing Actions to People
The OpenID Foundation’s 2025 whitepaper is blunt about this: today, an API call made by an agent on a user’s behalf is typically logged indistinguishably from an action the user took directly. When a flight is booked, the booking system sees a transaction. It does not see that the transaction was made by an agent, acting on a delegation from a sub-agent, which was spawned by a primary orchestrator, which was invoked by a user who said something like “plan our vacation.”
Delegated-authority patterns, such as true on-behalf-of (OBO) flows, can produce credentials that carry distinct identifiers for both the authorizing human and the acting agent. The Agent Payments Protocol (AP2) introduces signed Intent Mandates and Cart Mandates as auditable artifacts of user intent. These are the right concepts.
The forensic problem is harder. Short-lived credentials, SPIFFE SVIDs, and rotated OAuth tokens are good security practice. They are poor forensic instruments. When an incident surfaces two weeks after the fact, the credentials that authorized the relevant actions have been rotated away. The audit log may identify a token that no longer exists, issued to an agent instance that has been replaced. Accountability requires that the chain of custody be written, signed, and preserved, not just authenticated and discarded.
The C2PA (Coalition for Content Provenance and Authenticity) offers a useful parallel: tamper-evident metadata for digital content that records the chain of custody, the tools used, and the authorizations in place at each step. Something analogous for agent actions, a signed, append-only delegation log that accompanies the action and not just the token, would provide genuine traceability.
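A hash chain is the simplest way to get the append-only, tamper-evident half of that idea. The sketch below is illustrative only: the entry layout is made up, and per-entry signatures are omitted because the Python standard library has no asymmetric signing.

```python
import hashlib
import json

GENESIS = "0" * 64


def _entry_hash(prev: str, event: dict) -> str:
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, committing to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "event": event, "hash": _entry_hash(prev, event)})


def verify_log(log: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log: list[dict] = []
append_entry(log, {"actor": "orchestrator", "action": "delegate", "to": "travel-agent"})
append_entry(log, {"actor": "travel-agent", "action": "book_flight", "card": "visa"})
```

Unlike a rotated token, a log like this can still be checked weeks later, provided someone preserves it.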
On the Limits of “Know Your Agent”
The financial sector’s Know Your Customer (KYC) regime offers both inspiration and warning. KYC, at its best, establishes the legal identity of a counterparty before a transaction, creates an accountability anchor, and enables regulatory enforcement after the fact. Applied to agents, the analogous framework, Know Your Agent, would require verifiable agent provenance, capability attestation, and behavioral auditing.
Microsoft’s Entra Agent ID and Okta for AI Agents are moving in this direction, treating agents as first-class identity subjects with lifecycle management analogous to human employees. Several proposals also argue for a “Know Your Agent” layer in high-stakes contexts like payments. For example, KYAPay defines KYA/PAY token profiles that enable agents to present verified identity signals alongside payment credentials.
Two problems remain. First, these offerings are proprietary and siloed: an agent registered in Microsoft Entra cannot present a portable, verifiable credential to a non-Microsoft service. The interoperability that makes credentials meaningful across organizational boundaries does not yet exist for agent identities. The same portability gap shows up in KYA, despite various attempts at “reusable IDV”.
Second, and more fundamentally, KYA tells you what the agent was at registration. Some proposals go further and try to fingerprint an agent’s behavioral spec by hashing elements such as the system prompt, toolset, and model parameters, so downstream systems can detect drift between what was registered and what is running. But even a perfect fingerprint cannot account for what the agent becomes in operation: a context window that has been injected with adversarial content, a system prompt silently modified by a malicious MCP server, or simply an agent that has received enough conflicting instructions to drift from its principal’s intent. Identity verification at registration time is necessary but not sufficient. We need continuous behavioral verification, a harder problem that current standards do not address.
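The fingerprinting idea can be sketched in a few lines. The choice of fields and the canonicalization below are assumptions for illustration; no particular proposal's format is being reproduced.

```python
import hashlib
import json


def fingerprint(system_prompt: str, tools: list[str], model_params: dict) -> str:
    """Hash a canonical form of the agent's declared behavioral spec."""
    spec = {
        "system_prompt": system_prompt,
        "tools": sorted(tools),  # tool order should not change the fingerprint
        "model_params": model_params,
    }
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


registered = fingerprint(
    "You book travel.", ["search", "book"], {"model": "m-1", "temperature": 0}
)
running = fingerprint(
    "You book travel.", ["book", "search"], {"model": "m-1", "temperature": 0}
)
drifted = registered != running
```

A matching fingerprint shows that the declared spec is unchanged. It is silent about whatever has accumulated in the context window since the agent started running.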
What “Enough” Would Look Like
This post is not a call for abandoning the current standards. OAuth 2.1 is the right foundation. MCP with proper OAuth implementation closes the most obvious authentication gaps. SPIFFE/SPIRE provides strong workload identity within controlled environments. OpenID4VCI, especially combined with VC-based delegation chains, offers a promising path toward portable, cross-domain agent accountability.
What is missing is the connective grammar that turns these protocols into a trust framework - something that can answer, for any agent transaction:
Who initiated this? A traceable, cryptographically-anchored chain of custody from the action back to a specific human or legal entity.
What did they actually want? A verifiable, tamper-evident preference artifact: an Intent Mandate, a signed delegation policy, or something similar that travels alongside the authorization and can be checked at each step.
Is this agent still this agent? Continuous behavioral attestation that the agent at the point of action matches the agent that was authorized at the point of delegation.
What happened? A signed, preserved, post-hoc-accessible audit trail that outlives the credential.
These are not impossible requirements. The building blocks exist in fragments across the current standards landscape. What is missing is twofold: the willingness of enterprises, agentic system stacks, MCP operators, and model providers to invest in applying what is available today and to commit to adopting improvements as they emerge; and interoperability profiles that enable IDaaS providers to standardize on a common but extensible approach to agent identity.
Your Wallet, Your Vacation, Your Agent
Imagine you tell your AI assistant: “Use my wallet to select the best card and book the vacation we planned. Keep it under budget. Use the Visa for the flights.”
What you have just done, in identity terms, is:
Delegated authority to an orchestrating agent
Implicitly authorized sub-delegation to a travel-booking agent
Implicitly authorized that agent to connect to the hotel and airline MCP servers
Implicitly authorized payment execution using a specific card from a digital wallet
Expressed preferences (budget, card choice) that should constrain every downstream action
The current standards can authenticate the orchestrating agent. They can scope its access. They can, if properly implemented, narrow the sub-agent’s permissions below the primary agent’s. What they cannot do is carry your preferences (that the Visa is for flights, that the budget is hard, not soft, that you prefer direct flights over connections, that the hotel should be quiet) through the delegation chain as a verifiable artifact. They cannot ensure that the MCP server the travel agent connects to has not been constructed to manipulate its recommendations. They cannot give you a signed receipt three weeks later that shows exactly which agent made which decision at each step of the booking flow.
Until they can, the envelope your agent is delivering may have been rewritten in transit. Like Rosencrantz and Guildenstern, we may not discover this until it is too late.



