Payment Modes: Stripe Checkout and Shared Payment Tokens
Why Two Modes Exist
The two modes map to the two trust situations in agentic commerce. In a conversation with your MCP tool, the agent is a client application: it can assemble the cart and create the checkout, but it has no authority to move money, and no shopper should be typing card numbers into a chat. The correct artifact to hand over is a payment page operated by the processor, where card entry, 3DS challenges, wallets, and receipts are Stripe's problem and the shopper can see exactly who is charging what. That is direct mode.
On an ACP platform, the trust shape inverts. The platform has already collected and vaulted the shopper's payment method and obtained their authorization for this purchase. What it sends the merchant is not a card but a Shared Payment Token: a single-use, scoped credential that can be charged once, for this merchant, within an authorized ceiling. The merchant charges it server-to-server and returns the outcome in the same API call. No card data ever touches the merchant or the agent, which keeps PCI exposure minimal on every side. That is delegated mode, and it is what makes instant in-assistant checkout possible. The protocol context lives in What Is ACP.
Direct Mode: Hosted Stripe Checkout
When the MCP checkout tool runs (or an embedder calls createDirectPayment on the checkout service), the module builds a Stripe Checkout Session from the checkout's line items. Each line carries its title, quantity, and exact integer-cent unit amount; your configured successUrl and cancelUrl become the landing pages, and the module's session id is stamped into the Stripe metadata. Stripe returns a hosted payment URL, the module freezes the session in in_progress so nothing can re-price it while the shopper is looking at it, and the URL goes back to the agent.
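The mapping from a checkout to Stripe's session parameters can be sketched as a pure function. This is a minimal illustration, not the module's actual code: the CheckoutSession, CheckoutLine, and DirectPaymentConfig shapes, the function name, and the checkout_session_id metadata key are all assumptions made for the example. Only the Stripe field names (mode, line_items, price_data, unit_amount, product_data, success_url, cancel_url, metadata) come from Stripe's Checkout Session API.

```typescript
// Hypothetical shapes: the module's real types are not shown in this document.
interface CheckoutLine {
  title: string;
  quantity: number;
  unitAmountCents: number; // integer minor units, never a float
}

interface CheckoutSession {
  id: string;
  currency: string;
  lines: CheckoutLine[];
}

interface DirectPaymentConfig {
  successUrl: string;
  cancelUrl: string;
}

// Builds the parameter object for Stripe's "create Checkout Session" call.
// It does not call Stripe; the caller would pass the result to the SDK.
function buildCheckoutSessionParams(
  session: CheckoutSession,
  config: DirectPaymentConfig
) {
  return {
    mode: "payment" as const,
    success_url: config.successUrl,
    cancel_url: config.cancelUrl,
    // Assumed key name; the webhook handler would read the same key back.
    metadata: { checkout_session_id: session.id },
    line_items: session.lines.map((line) => {
      if (!Number.isInteger(line.unitAmountCents) || line.unitAmountCents < 0) {
        throw new RangeError(`unit amount must be integer cents: ${line.title}`);
      }
      return {
        quantity: line.quantity,
        price_data: {
          currency: session.currency,
          unit_amount: line.unitAmountCents,
          product_data: { name: line.title },
        },
      };
    }),
  };
}
```

Keeping the builder free of network calls makes it easy to unit-test that every line reaches Stripe with the same integer amount the session holds.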
Payment confirmation is asynchronous by nature in this mode. The shopper might pay immediately, in ten minutes, or never. The signal arrives as a Stripe webhook (checkout.session.completed or payment_intent.succeeded). The module verifies it, matches it back to the checkout session through the metadata, marks the session completed, and fires your order webhook. The webhook wiring is its own short setup, covered in how to set up Stripe webhooks. If tax is enabled, the hosted page itself collects the billing address and applies Stripe's automatic tax, so the shopper sees final tax before paying; the details are in sales tax with Stripe Tax.
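The matching step can be sketched as follows, assuming the event has already passed signature verification. The StoredSession and VerifiedEvent shapes, the checkout_session_id metadata key, and the in-memory Map are hypothetical stand-ins for the module's own storage. Ignoring an event for an already completed session is also an assumption of this sketch; it reflects the fact that Stripe may deliver the same webhook more than once.

```typescript
// Hypothetical shapes for illustration only.
type SessionStatus = "in_progress" | "completed";

interface StoredSession {
  id: string;
  status: SessionStatus;
}

interface VerifiedEvent {
  type: string;
  data: { object: { metadata?: Record<string, string> } };
}

// The two event types the text names as payment confirmations.
const CONFIRMING_EVENTS = new Set([
  "checkout.session.completed",
  "payment_intent.succeeded",
]);

// Applies one verified event to the session store and reports what happened.
function applyPaymentEvent(
  event: VerifiedEvent,
  sessions: Map<string, StoredSession>,
  onOrder: (session: StoredSession) => void
): "completed" | "ignored" {
  if (!CONFIRMING_EVENTS.has(event.type)) return "ignored";

  // Assumed metadata key, written when the Checkout Session was created.
  const sessionId = event.data.object.metadata?.["checkout_session_id"];
  if (!sessionId) return "ignored";

  const session = sessions.get(sessionId);
  // Unknown session or repeated delivery: do nothing, fire nothing.
  if (!session || session.status === "completed") return "ignored";

  session.status = "completed";
  onOrder(session); // stands in for firing the order webhook
  return "completed";
}
```

Because the second delivery returns "ignored", the order callback runs at most once per session in this sketch.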
Delegated Mode: Shared Payment Tokens
On the ACP complete endpoint, the platform's request carries payment_data with a token and usually a billing address. The module first applies the billing address and re-prices the session so tax is computed against the buyer's real location. It then creates a Stripe PaymentIntent using the token as a shared_payment_granted_token with immediate confirmation, charging exactly the session's total: the same integer-cent amount the platform displayed to the shopper. Three outcomes map to three clean results: success completes the session and attaches the order, requires_action surfaces as a requires_3ds error the platform resolves with the shopper, and any decline becomes payment_declined with Stripe's decline code in the message. Stripe-side errors are caught and translated, so the raw exception never crosses the API boundary.
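The three-way outcome mapping can be illustrated with two small translation functions. This is a sketch under stated assumptions: the CompleteResult shape, the function names, and the message wording are hypothetical. The status values succeeded and requires_action are real PaymentIntent statuses, and decline_code is the field Stripe card errors carry. Treating every other status as a decline is a simplification made for the example.

```typescript
// Hypothetical result shape returned across the API boundary.
type CompleteResult =
  | { ok: true; paymentRef: string }
  | { ok: false; code: "requires_3ds" | "payment_declined"; message: string };

// The subset of a PaymentIntent this sketch needs.
interface IntentOutcome {
  id: string;
  status: string;
}

// Translates a confirmed PaymentIntent into one of the three results.
function mapIntentOutcome(intent: IntentOutcome): CompleteResult {
  switch (intent.status) {
    case "succeeded":
      return { ok: true, paymentRef: intent.id };
    case "requires_action":
      return { ok: false, code: "requires_3ds", message: "Additional authentication is required" };
    default:
      // Simplification: any other status is reported as a decline.
      return { ok: false, code: "payment_declined", message: `Payment not completed (status: ${intent.status})` };
  }
}

// Reads decline_code from a thrown error without trusting its shape.
function declineCodeOf(err: unknown): string {
  const code = (err as { decline_code?: unknown } | null)?.decline_code;
  return typeof code === "string" ? code : "unknown";
}

// Translates a thrown Stripe-side error; only the decline code is exposed,
// so the raw exception object never reaches the caller.
function mapStripeError(err: unknown): CompleteResult {
  return { ok: false, code: "payment_declined", message: `Payment declined (${declineCodeOf(err)})` };
}
```

A caller would wrap the PaymentIntent creation in try/catch, pass the returned intent to mapIntentOutcome, and pass any thrown error to mapStripeError.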
The Mock Provider
With no STRIPE_SECRET_KEY in the environment, the module swaps in a mock provider and says so in the logs. The mock is deterministic and built for tests: createPayment returns a fake hosted URL containing the session id, and completeWithToken obeys token conventions: tok_decline produces a declined result, tok_3ds produces a requires-action result, and anything else succeeds with a predictable payment reference. It also records every call it receives, so a test suite can assert a checkout charged exactly once. This is the same seam a team would use to integrate a different processor: PaymentProvider is four methods (createPayment, completeWithToken, refund, verifyWebhook), and nothing else in the module knows Stripe exists.
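A provider seam of this kind might look like the sketch below. The four method names and the token conventions come from the description above; the parameter lists, return types, the mock URL, and the mock_pay_ reference format are assumptions for illustration. The methods are kept synchronous for brevity, whereas a real processor integration would return promises.

```typescript
// Hypothetical result shape; the module's real types are not shown here.
interface PaymentResult {
  status: "succeeded" | "declined" | "requires_action";
  paymentRef?: string;
}

// Four-method seam; signatures are assumed, names are from the text.
interface PaymentProvider {
  createPayment(sessionId: string, amountCents: number): { url: string };
  completeWithToken(sessionId: string, token: string, amountCents: number): PaymentResult;
  refund(paymentRef: string, amountCents: number): { refundedCents: number };
  verifyWebhook(rawBody: string, signature: string): boolean;
}

class MockProvider implements PaymentProvider {
  // Every call is recorded so tests can assert on what was charged.
  readonly calls: { method: string; args: unknown[] }[] = [];

  createPayment(sessionId: string, amountCents: number) {
    this.calls.push({ method: "createPayment", args: [sessionId, amountCents] });
    return { url: `https://mock.invalid/pay/${sessionId}` };
  }

  completeWithToken(sessionId: string, token: string, amountCents: number): PaymentResult {
    this.calls.push({ method: "completeWithToken", args: [sessionId, token, amountCents] });
    if (token === "tok_decline") return { status: "declined" };
    if (token === "tok_3ds") return { status: "requires_action" };
    return { status: "succeeded", paymentRef: `mock_pay_${sessionId}` };
  }

  refund(paymentRef: string, amountCents: number) {
    this.calls.push({ method: "refund", args: [paymentRef, amountCents] });
    return { refundedCents: amountCents };
  }

  verifyWebhook(rawBody: string, signature: string) {
    this.calls.push({ method: "verifyWebhook", args: [rawBody, signature] });
    return true; // the mock accepts every signature
  }
}
```

A test can then filter calls for completeWithToken and assert the count is exactly one for a given session.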
Amounts, Declines, and Refunds
The amount charged is always the session's total entry, the same server-computed number the totals math produced from catalog prices and tax, never a figure from the client. All math is integer minor units with overflow guards, and quantity ceilings keep line totals far inside safe integer range. The practices behind that are described in the security model.
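Guarded integer arithmetic of this kind can be sketched in a few lines. The function names and the MAX_QUANTITY value are hypothetical; the module's actual ceilings are not stated in this document. The guards rely on Number.isInteger and Number.isSafeInteger, which are standard JavaScript.

```typescript
// Hypothetical ceiling; the module's real limit is not given here.
const MAX_QUANTITY = 10000;

interface PricedLine {
  unitAmountCents: number;
  quantity: number;
}

// Multiplies one line, rejecting floats, negatives, and oversized quantities.
function lineTotalCents(unitAmountCents: number, quantity: number): number {
  if (!Number.isInteger(unitAmountCents) || unitAmountCents < 0) {
    throw new RangeError("unit amount must be a non-negative integer");
  }
  if (!Number.isInteger(quantity) || quantity < 1 || quantity > MAX_QUANTITY) {
    throw new RangeError("quantity out of range");
  }
  const total = unitAmountCents * quantity;
  if (!Number.isSafeInteger(total)) {
    throw new RangeError("line total exceeds safe integer range");
  }
  return total;
}

// Sums all lines plus tax, re-checking the running total after each step.
function sessionTotalCents(lines: PricedLine[], taxCents: number): number {
  if (!Number.isInteger(taxCents) || taxCents < 0) {
    throw new RangeError("tax must be a non-negative integer");
  }
  let total = taxCents;
  for (const line of lines) {
    total += lineTotalCents(line.unitAmountCents, line.quantity);
    if (!Number.isSafeInteger(total)) {
      throw new RangeError("session total exceeds safe integer range");
    }
  }
  return total;
}
```

For example, three units at 1999 cents plus one unit at 500 cents and 520 cents of tax come to 5997 + 500 + 520 = 7017 cents.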
Declines deserve a moment of operational honesty: they are normal. A few percent of legitimate card attempts decline for issuer reasons, and in delegated mode your decline rate is also shaped by the platform's vaulting quality. The module's job is to report them cleanly and keep state consistent: a declined session stays ready_for_payment so the shopper can try again, while a completed session can never be charged twice thanks to the idempotent complete semantics. The provider interface also carries refund(paymentRef, amount), wired to Stripe refunds. It supports full or partial refunds against the payment reference stored on the completed session, and your support tooling can call it directly through the library. For the merchant-side economics of processing, declines, and disputes, our payment processing pillar covers the landscape beyond this module.
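The state rules for declines and repeat completions can be sketched as a single guard around the charge. The PayableSession and ChargeOutcome shapes and the completeOnce name are hypothetical; the sketch only illustrates the two properties described above, not the module's implementation.

```typescript
// Hypothetical shapes for illustration.
type ChargeOutcome =
  | { status: "succeeded"; paymentRef: string }
  | { status: "declined"; declineCode: string };

interface PayableSession {
  status: "ready_for_payment" | "completed";
  paymentRef?: string;
}

// Runs the charge at most once per session.
function completeOnce(
  session: PayableSession,
  charge: () => ChargeOutcome
): ChargeOutcome {
  // Idempotent replay: a completed session returns its stored result
  // and the charge function is never invoked again.
  if (session.status === "completed" && session.paymentRef !== undefined) {
    return { status: "succeeded", paymentRef: session.paymentRef };
  }

  const outcome = charge();
  if (outcome.status === "succeeded") {
    session.status = "completed";
    session.paymentRef = outcome.paymentRef;
  }
  // On a decline the session is left as ready_for_payment,
  // so the shopper can retry with another payment method.
  return outcome;
}
```

Calling completeOnce again after a success returns the same payment reference, which is what lets a platform safely retry a request whose response was lost.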
Choosing and Combining the Modes
In practice you do not choose between them; you run both, because each serves a different surface of the same store. The MCP tools use direct mode for any client, the ACP endpoints accept delegated tokens from platforms, and both paths price from the same catalog, share the same session rules, and hand completed orders to the same order webhook. The only configuration difference is which secrets you set: a Stripe key enables both modes, the webhook secret makes direct-mode confirmations trustworthy, and the ACP bearer token opens the platform surface. A sensible rollout order is direct mode first, because it monetizes your MCP tool immediately, then the ACP surface when a platform integration becomes concrete.
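The relationship between secrets and enabled surfaces can be summarized in a small resolver. Only STRIPE_SECRET_KEY is named in this document; STRIPE_WEBHOOK_SECRET and ACP_BEARER_TOKEN are hypothetical variable names used for the sketch, as are the Surfaces shape and the function name.

```typescript
// Hypothetical summary of what a given environment enables.
interface Surfaces {
  provider: "stripe" | "mock";
  directConfirmationsVerified: boolean;
  acpEnabled: boolean;
}

function resolveSurfaces(env: Record<string, string | undefined>): Surfaces {
  return {
    // Without a Stripe key, the mock provider is used.
    provider: env["STRIPE_SECRET_KEY"] ? "stripe" : "mock",
    // Assumed variable name for the webhook signing secret.
    directConfirmationsVerified: Boolean(env["STRIPE_WEBHOOK_SECRET"]),
    // Assumed variable name for the ACP bearer token.
    acpEnabled: Boolean(env["ACP_BEARER_TOKEN"]),
  };
}
```

A startup check built on a resolver like this could log which surfaces are live, mirroring the rollout order described above: Stripe key and webhook secret first, ACP token later.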
What the Shopper Experiences in Each Mode
The difference is most visible from the buyer's seat. In direct mode, the conversation produces a link: the agent says the total, the shopper clicks through to a Stripe-branded page that lists the exact line items the agent quoted, pays with card or wallet, and lands back on your success page while the agent confirms the order in chat a moment later. The shopper sees the processor's name and your line items, which is reassuring precisely because it is familiar: this is the same hosted checkout used across the web. In delegated mode there is no link at all: the assistant shows a native confirmation with items, tax, and total, the shopper approves once, and the order confirmation appears in the same thread seconds later. The first mode trades a click for universal availability; the second removes every click but depends on platform enrollment.
Both experiences end identically on your side: a completed session, a payment reference, and one order payload delivered to your systems. That symmetry is deliberate: your fulfillment code should never care which door the money came through, and with this module it does not have to.
