Is crypto needed to protect the security of AI agents paying each other online?
The infrastructure race for agentic commerce is already producing winners.
Anthropic’s Model Context Protocol, which connects AI applications to external tools and data, now counts more than 10,000 public servers and 97 million monthly SDK downloads.
Google’s Agent-to-Agent protocol launched in April 2025 with 50 partners and scaled to more than 100 supporting companies before moving under Linux Foundation governance.
On Jan. 11, Google unveiled the Universal Commerce Protocol, pulling in Shopify, Walmart, Target, Mastercard, Stripe, Visa, and American Express as early supporters, aiming to standardize how agents navigate live checkout flows.
Coinbase’s x402 protocol handles the payment transport layer, enabling automatic stablecoin payments over HTTP. The project reported more than 100 million payments processed across APIs, apps, and AI agents by late 2025.
That is a lot of standardization for a technology category that barely existed three years ago.
However, every one of those protocols addresses the same narrow slice: how agents connect, coordinate, and initiate payments.
None of them answers the harder commercial question sitting one step further down the stack: Who decides the work was actually done?
| Protocol / standard | What it does | What it does not solve | Why it matters in this story |
|---|---|---|---|
| MCP (Model Context Protocol) | Connects AI applications and agents to external tools, APIs, and data sources | Does not verify whether a task outcome was actually delivered | It handles the tool/data layer, not the trust layer around completed work |
| A2A (Agent-to-Agent) | Lets agents communicate and coordinate across systems or organizations | Does not hold funds in escrow or judge deliverable quality | It solves agent interoperability, but not conditional settlement |
| UCP (Universal Commerce Protocol) | Standardizes agent-driven commerce and checkout flows | Does not determine whether a purchased service or task was satisfactorily completed | It pushes agents deeper into real transactions, making the missing verification layer more visible |
| AP2 (Agent Payments Protocol) | Uses signed payment mandates to prove what an agent is authorized to spend | Proves permission, not whether the paid-for outcome materialized | It is an authorization standard, not a work-verification standard |
| x402 | Enables automatic payments over HTTP, including stablecoin payments | Moves money, but does not decide whether money should move only after work is verified | It is the payment transport rail, not the escrow/adjudication layer |
| Mastercard Verifiable Intent | Creates a trust and audit layer for proving user purchase authorization | Focuses on sanctioned purchases and dispute trails, not task completion itself | It shows incumbents are standardizing intent and accountability, but still not full outcome verification |
| ERC-8183 | Defines a job-based escrow flow: funds locked, work submitted, evaluator completes or rejects, expiry can refund client | Does not solve evaluator trust, disputes, or “agentic” identity by itself | It is the article’s hook because it targets the missing conditional payment / verification step |
| ERC-8004 | Provides a trust/reputation framework for agents and counterparties | Is not itself an escrow or payment-release mechanism | It is the likely composition layer for making ERC-8183-style evaluation more trustworthy |
| Oracle / staking / zkML / TEE-style trust systems | Potential ways to verify outcomes or back evaluator judgments with stronger guarantees | None is a settled standard for broad agentic commerce yet | These are possible answers to the article’s central question: who gets to judge that the job was done? |
Escrow as the missing primitive
ERC-8183, a draft Ethereum standard published Feb. 25, is crypto’s attempt to make that judgment programmable.
Strip the jargon, and the proposal is a minimal state machine for task-based commerce: a client locks the budget into escrow, a provider submits work, and an evaluator marks the job complete or rejects it.
Expiry refunds the client automatically. The spec names four states in sequence: Open, Funded, Submitted, and Terminal. It also states explicitly that only the evaluator may mark a job as completed once work has been submitted.
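To make that flow concrete, here is a minimal Python sketch of the four-state job lifecycle. It is an illustration under stated assumptions, not the ERC-8183 interface: the class, method, and field names are hypothetical, as is the choice to allow an expiry refund from both the Funded and Submitted states.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Tuple


class State(Enum):
    OPEN = auto()
    FUNDED = auto()
    SUBMITTED = auto()
    TERMINAL = auto()


@dataclass
class Job:
    """Toy model of a job-based escrow; all names here are hypothetical."""
    client: str
    provider: str
    evaluator: str
    budget: int
    expires_at: int                    # assumed: a timestamp or block height
    state: State = State.OPEN
    escrowed: int = 0
    outcome: Optional[str] = None      # "completed", "rejected", or "expired"
    deliverable: Optional[str] = None

    def fund(self, caller: str) -> None:
        # Open -> Funded: the client locks the budget into escrow.
        if caller != self.client or self.state is not State.OPEN:
            raise PermissionError("only the client can fund an open job")
        self.escrowed = self.budget
        self.state = State.FUNDED

    def submit(self, caller: str, deliverable: str) -> None:
        # Funded -> Submitted: the provider hands in the work.
        if caller != self.provider or self.state is not State.FUNDED:
            raise PermissionError("only the provider can submit to a funded job")
        self.deliverable = deliverable
        self.state = State.SUBMITTED

    def _settle(self, outcome: str, payee: str) -> Tuple[str, int]:
        # Any settlement empties the escrow and moves the job to Terminal.
        payout = (payee, self.escrowed)
        self.escrowed = 0
        self.outcome = outcome
        self.state = State.TERMINAL
        return payout

    def complete(self, caller: str) -> Tuple[str, int]:
        # Submitted -> Terminal: only the evaluator can release funds.
        if caller != self.evaluator or self.state is not State.SUBMITTED:
            raise PermissionError("only the evaluator can complete submitted work")
        return self._settle("completed", self.provider)

    def reject(self, caller: str) -> Tuple[str, int]:
        # Submitted -> Terminal: rejection returns the budget to the client.
        if caller != self.evaluator or self.state is not State.SUBMITTED:
            raise PermissionError("only the evaluator can reject submitted work")
        return self._settle("rejected", self.client)

    def claim_refund(self, now: int) -> Tuple[str, int]:
        # Assumption: expiry refunds the client from Funded or Submitted.
        if self.state not in (State.FUNDED, State.SUBMITTED) or now < self.expires_at:
            raise PermissionError("job is not refundable yet")
        return self._settle("expired", self.client)
```

In this sketch, a call such as `job.complete("provider")` raises `PermissionError`, while `job.complete("evaluator")` returns the payee and the released amount. All of the judgment sits behind that single caller check, which is why the evaluator role draws so much scrutiny.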
That architecture is narrower than its “agentic commerce” framing implies.
Critics in the Ethereum Magicians discussion thread pointed out that there is “nothing especially ‘agentic’” about the proposal. One commenter called it “a job registry with escrowed funds.”
The critique is accurate, and also the most useful thing about the story.
What ERC-8183 actually specifies is a programmable escrow primitive applicable to any task-based transaction, human or machine.
The AI framing is layered on top of a structure that predates agents entirely. The more interesting question is whether that structure is the one piece the stack currently lacks.

The authorization-verification gap
The payments incumbents building around agentic commerce are solving authorization, not verification.
Google’s Agent Payments Protocol (AP2) frames payments around cryptographically signed mandates that prove what an agent was permitted to spend.
Mastercard’s Verifiable Intent, co-developed with Google and introduced on Mar. 5, creates a trust layer for proving what a user authorized and an audit trail designed for dispute resolution.
Those are robust answers to “Was this purchase sanctioned?” They say nothing about whether the purchased outcome materialized.
That gap is the productive contradiction in the stack.
A2A ensures agents can talk across organizational boundaries. MCP ensures they can reach the right tools and data. AP2 and x402 ensure money moves automatically. ERC-8183 proposes that the funds be held conditionally until an evaluator attests that the deliverable has cleared.
Whether that evaluator is the client, an oracle network, a staking system, or a zkML proof is left to implementers, but the spec explicitly names ERC-8004’s trust and reputation layer as the recommended composition point for higher-value jobs.
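One way to picture "left to implementers" is to treat the evaluator as a pluggable function that the escrow consults before releasing funds. The Python sketch below is hypothetical throughout: the keyword check stands in for a client judging its own job, and the quorum function is a toy stand-in for an oracle-style network, not any real oracle, staking, or zkML system.

```python
from typing import Callable, Sequence, Tuple

# An evaluator takes a deliverable and returns True to accept it.
Evaluator = Callable[[str], bool]


def keyword_check(required: str) -> Evaluator:
    """Toy single-party evaluator, e.g. a client judging its own job."""
    return lambda deliverable: required in deliverable


def quorum_evaluator(voters: Sequence[Evaluator], threshold: int) -> Evaluator:
    """Hypothetical stand-in for an oracle network: accept when at least
    `threshold` independent voters accept the deliverable."""
    def evaluate(deliverable: str) -> bool:
        approvals = sum(1 for vote in voters if vote(deliverable))
        return approvals >= threshold
    return evaluate


def settle(deliverable: str, evaluator: Evaluator,
           client: str, provider: str, escrowed: int) -> Tuple[str, int]:
    """Pay the provider if the evaluator accepts, otherwise refund the client."""
    payee = provider if evaluator(deliverable) else client
    return payee, escrowed


# Usage: the settlement logic stays the same while the judge is swapped out.
voters = [keyword_check("summary"), keyword_check("sources"), keyword_check("chart")]
two_of_three = quorum_evaluator(voters, threshold=2)
print(settle("summary with sources", two_of_three, "client", "provider", 100))
print(settle("summary only", two_of_three, "client", "provider", 100))
```

The escrow logic never changes between the two calls. Only the judge does, which is why the choice of evaluator carries most of the trust assumptions.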
The power center nobody named
The evaluator role is where the proposal becomes politically interesting.
ERC-8183’s security section warns that a malicious evaluator can arbitrarily complete or reject jobs, recommends reputation or staking mechanisms for higher-value contracts, and acknowledges that there is no dispute resolution within the core spec.
One builder in the Magicians thread wrote that “the Evaluator is where the real complexity lives.” Another summarized the broader problem as “everyone verifies the payment, nobody verifies the work.”
Those observations point to a structural dynamic in any open agent marketplace: whoever controls evaluation controls the marketplace.
The spec’s design makes the tension explicit.
For enterprise deployments where the client and evaluator are the same entity, the complexity is manageable. For multi-party agent networks where a provider in one organization submits work to a client in another, the evaluator becomes a trust bottleneck with platform-level leverage.
ERC-8183 names the choke point without yet having a durable answer for it.
Where the stack actually stands
The adoption numbers suggest the surrounding layers are moving faster than verification.
Gartner says 33% of enterprise software applications will include agentic AI by 2028, and 15% of day-to-day work decisions will run autonomously by that year, up from 0% in 2024.
Deloitte pegs the global agentic AI market at $8.5 billion in 2026, rising toward $35 billion by 2030, with 75% of companies potentially investing in the category by the end of this year.
IBM and NRF reported in January that 45% of consumers already use AI during buying journeys, including 41% for product research.
That volume of agentic activity needs settlement infrastructure.
The bull case for ERC-8183 and its surrounding stack is that open agent marketplaces, covering research, code, inference, data, and microservices, generate enough cross-organizational, machine-to-machine commerce that on-chain conditional settlement becomes genuinely necessary.
The bear case is that payments incumbents and enterprise software absorb the verification problem before crypto builds a durable wedge.
AP2’s cryptographic mandates, Verifiable Intent’s authorization audit trail, and UCP’s live retailer integrations are already positioning card networks and Big Tech at exactly the layer that ERC-8183 targets from the other direction.


Who owns the judgment layer
If Gartner’s 2028 projections hold, and agentic AI handles a meaningful share of enterprise procurement, research outsourcing, and service buying, the highest-margin position in that stack will not be held by the model provider.
It will belong to whoever owns the moment of conditional payment: the infrastructure that holds funds, attests to outcomes, and releases money only when the work clears verification.
ERC-8183 may be that layer, or it may be marketplace escrow wearing better branding.
The Magicians thread is right that the underlying structure predates AI entirely. Yet the same holds for most financial primitives that turned out to matter.
Escrow predates the internet. Conditional payment predates blockchains.
The theory being stress-tested right now is whether the verification problem in agentic commerce is best solved by Big Tech’s authorization standards or by programmable on-chain escrow with composable trust layers.
Both approaches are live, neither is settled, and the answer will likely depend on where agents are doing the most economically meaningful work when adoption crosses the threshold that makes the infrastructure fight worth having.









































