- Credit Governance Is the Template for AI Agent Spend
The fastest-growing line item in the enterprise P&L in 2026 has no standard governance model. Ramp’s April 2026 AI Index shows token spend up 13x since January 2025 and AI expense reimbursements tripled year-over-year. The second number is the more interesting one. Tripled reimbursements mean employees are putting AI tools on personal cards and claiming them back. That is the textbook signature of a category running ahead of its control plane. It is why corporate card programs exist at all.
Anton Zagrebelny, CTO of Stigg, published a field study this month after test-driving the credit-governance stacks at four AI-native companies. His taxonomy is the cleanest vendor-side map we have: OpenAI’s Decision Waterfall, Cursor’s Dollar-Denominated Pools, Clay’s Dual-Currency Separation, Vercel’s Observability-First Governance. Four primary axes, four non-composable architectures, four defensible product bets.
Read him closely and the AI vendors are not inventing a new primitive. They are re-implementing stored-value ledgers, authorization cascades, and hold-and-settle patterns that SaaS billing and prepaid-card infrastructure shipped decades ago. The vocabulary is new. The mechanisms are not.
That would be a tidy essay if the analog held. It doesn’t. Not entirely. There is a gap between what credit governance can enforce and what agents can do inside that enforcement window, and that gap is where CFOs are about to learn an expensive lesson.
The four models, briefly
OpenAI: Decision Waterfall. One synchronous evaluation path resolves every applicable rule — rate limits, free tiers, credits, enterprise contracts — in a single pass. Three-level hierarchy: Organization, Projects, API Keys. Tier promotion on cumulative spend and account age. The bank analog is the authorization cascade on a card transaction: issuer stand-in, hot file, velocity check, all resolved in sub-100ms before the merchant releases goods. OpenAI’s team reportedly rejected third-party billing platforms because they could not match the latency. That is the same reason issuers don’t outsource authorization.
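The single-pass shape can be sketched in a few lines. This is an illustrative reconstruction, not OpenAI's actual implementation: the rule set, ordering, and names here are assumptions, but the structure — hard denials first, then funding sources resolved in priority order, all in one synchronous pass — is the waterfall pattern.

```python
from dataclasses import dataclass

@dataclass
class Request:
    org: str
    project: str
    est_cost: float  # estimated dollar cost of this call

@dataclass
class Waterfall:
    """One synchronous pass: hard limits deny first, then funding
    sources resolve in priority order. Illustrative sketch only."""
    rate_limit_ok: bool = True
    free_tier_remaining: float = 0.0
    credits_remaining: float = 0.0
    contract_cap_remaining: float = 0.0

    def authorize(self, req: Request) -> tuple[bool, str]:
        if not self.rate_limit_ok:                       # rule 1: rate limits
            return False, "rate_limited"
        if req.est_cost <= self.free_tier_remaining:     # rule 2: free tier
            self.free_tier_remaining -= req.est_cost
            return True, "free_tier"
        if req.est_cost <= self.credits_remaining:       # rule 3: prepaid credits
            self.credits_remaining -= req.est_cost
            return True, "credits"
        if req.est_cost <= self.contract_cap_remaining:  # rule 4: enterprise contract
            self.contract_cap_remaining -= req.est_cost
            return True, "contract"
        return False, "over_cap"

wf = Waterfall(credits_remaining=5.0, contract_cap_remaining=100.0)
print(wf.authorize(Request("acme", "search", 2.0)))   # drawn from credits
print(wf.authorize(Request("acme", "search", 50.0)))  # falls through to contract
```

The point of the single pass is latency: every rule is resolved before the response is released, which is exactly the issuer-authorization discipline.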
Cursor: Dollar-Denominated Pools. A credit pool whose value equals the plan price, in dollars. Per-user pools on Teams, shared pools on Enterprise. Cursor migrated away from abstract “fast requests” in June 2025 because nobody could answer what a fast request was worth. The bank analog is a demand deposit account — a dollar balance drawn down by priced transactions where the priced unit is visible at transaction time.
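A dollar pool is the simplest of the four mechanisms: a balance debited by priced transactions, with the priced unit visible at transaction time. A minimal sketch, not Cursor's billing code; the rate used is illustrative.

```python
class DollarPool:
    """Plan-priced balance drawn down by dollar-priced requests."""
    def __init__(self, plan_price: float):
        self.balance = plan_price

    def charge(self, tokens: int, price_per_1k: float) -> bool:
        cost = tokens / 1000 * price_per_1k  # the priced unit is visible here
        if cost > self.balance:
            return False                     # pool exhausted: deny
        self.balance -= cost
        return True

pool = DollarPool(plan_price=20.00)
pool.charge(tokens=500_000, price_per_1k=0.03)  # a $15.00 debit
print(round(pool.balance, 2))                   # 5.0
```

Compare this with the abstract "fast request" it replaced: the user can reason about the debit because the unit is a dollar, not an opaque credit.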
Clay: Dual-Currency Separation. Actions (one per platform task, fixed) versus Data Credits (0.5 to 10+ per third-party lookup, variable). Two ledgers, different risk classes, separate limits. The bank analog is separate facilities by purpose — working capital, capex, trade finance — each with its own ceiling because the risks don’t substitute for each other.
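The defining property of the dual-currency design is non-fungibility: exhausting one ledger must never drain the other. A minimal sketch under assumed names; Clay's actual pricing and limits differ.

```python
class DualLedger:
    """Two non-fungible balances with separate ceilings."""
    def __init__(self, actions: int, data_credits: float):
        self.actions = actions            # fixed-cost platform tasks
        self.data_credits = data_credits  # variable-cost third-party lookups

    def run_task(self) -> bool:
        if self.actions < 1:
            return False
        self.actions -= 1
        return True

    def lookup(self, cost: float) -> bool:
        if cost > self.data_credits:
            return False  # this ledger is empty; Actions are untouched
        self.data_credits -= cost
        return True

ledger = DualLedger(actions=2, data_credits=3.0)
print(ledger.lookup(2.5), ledger.lookup(2.5))  # second lookup denied
print(ledger.run_task())                       # Actions still available
```

That separation is the point: a spike in variable-cost lookups cannot silently consume the budget for fixed-cost platform work, just as a drawn-down trade facility cannot eat the capex line.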
Vercel: Observability-First Governance. Request-time attribution on six dimensions (day, user, model, tag, provider, credential type). Webhook-triggered enforcement at spend thresholds. No prescribed hierarchy. The bank analog is expense tagging with real-time alerts — but enforcement is administrative, not transactional.
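The observability-first posture can be sketched as attribution plus a threshold hook. Dimension names and the threshold here are illustrative, not Vercel's API. Note where the hook fires: after the spend is recorded, which is what "administrative, not transactional" means in code.

```python
from collections import defaultdict

class SpendObserver:
    """Request-time attribution plus a webhook-style threshold hook.
    The hook alerts or revokes; it does not block the request that
    crossed the line. Illustrative sketch only."""
    def __init__(self, threshold: float, on_threshold):
        self.by_dimension = defaultdict(float)
        self.total = 0.0
        self.threshold = threshold
        self.on_threshold = on_threshold
        self.fired = False

    def record(self, cost: float, **dims):  # e.g. day=, user=, model=, tag=
        self.total += cost
        for key, value in dims.items():
            self.by_dimension[(key, value)] += cost
        if self.total >= self.threshold and not self.fired:
            self.fired = True
            self.on_threshold(self.total)   # fires after the spend landed

alerts = []
obs = SpendObserver(threshold=100.0, on_threshold=alerts.append)
obs.record(60.0, user="agent-7", model="m1")
obs.record(55.0, user="agent-7", model="m1")  # crosses the threshold
print(alerts)                                 # [115.0]
```

The dollar that triggered the alert is already spent by the time the hook runs. That gap between recording and enforcement is the thread the rest of this essay pulls on.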
Each model is internally coherent. Each optimizes a different primary axis. An enterprise buying three of them inherits three incompatible governance surfaces and will discover this in the quarter after procurement signs.
What transfers, and what doesn’t
The primitives that transfer cleanly from credit to AI are the ones credit systems solved under strict latency: authorization cascades, hold-and-settle, hierarchical entitlements, attribution metadata. These are genuinely portable. Use them. Name them honestly.
What does not transfer is the assumption that a centralized ledger can enforce faster than the workload can burn. This is where the analog breaks, and where Zagrebelny’s four models all share the same blind spot.
In a card network, authorization runs ahead of value transfer. The merchant cannot dispense the gas until the issuer says yes. The enforcement window precedes the cost.
Agents do not respect that ordering. An agent can chain tool calls, spawn sub-agents, issue parallel requests, and retry with backoff faster than any centralized ledger can react. A runaway agent can burn $10,000 in ninety seconds inside a single API key that the ledger still sees as “within limit.” The authorized envelope is valid. The emergent cost is not.
All four models Zagrebelny documents treat the API key as the enforcement boundary. When the principal behind that key fans out into a thousand sub-requests — each individually authorized, each individually priced, cumulatively pathological — the ledger is a spectator. This is not a bug in any of the four models. It is a structural mismatch between credit-decision latency and agent-execution latency.
Three failure modes follow. Ledger races, where parallel agent calls both pre-reserve against the same balance. Retry storms, where partial-failure cascades cumulatively exceed the original envelope. Sub-agent attribution loss, where spawned workers either inherit the parent’s key (and exhaust it) or acquire their own (and bypass central policy). Card networks never saw any of this. Cards do not spawn sub-cards. Agents do.
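The first failure mode reproduces in a dozen lines. A check-then-debit ledger that is correct for one caller double-authorizes under parallel callers, because the check and the debit are not atomic. This sketch forces the bad interleaving with a barrier so the race is deterministic rather than probabilistic.

```python
import threading

class NaiveLedger:
    """Check-then-debit with a window between the two steps."""
    def __init__(self, balance: float):
        self.balance = balance

    def reserve(self, amount: float, pause=lambda: None) -> bool:
        if amount <= self.balance:  # both parallel callers can pass this check
            pause()                 # window where the other caller also checks
            self.balance -= amount
            return True
        return False

# Two $60 reserves against a $100 balance: both are authorized.
ledger = NaiveLedger(100.0)
barrier = threading.Barrier(2)  # holds each caller inside the window
results = []

def worker():
    results.append(ledger.reserve(60.0, pause=barrier.wait))

t1 = threading.Thread(target=worker)
t2 = threading.Thread(target=worker)
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # both True, though the balance funds only one
```

The fix is the same as in payments: an atomic hold (a lock, a compare-and-swap, or a database transaction) around check-and-debit. The harder problem is that agents generate these parallel calls by design, at a fan-out no card network ever had to price.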
What this means for the enterprise
The implications compound, and they are not the implications a FinOps dashboard is built to surface.
First, credit governance remains necessary. The primitives are correct. Every enterprise running AI at scale needs a hierarchy, a waterfall, hold-and-settle for variable-cost operations, and attribution at request time. Those are table stakes. This is the finance-domain answer to the cross-domain tooling gap — the same functional empty cell that marketing, design, and legal teams are also discovering in 2026.
Second, credit governance is no longer sufficient. Enforcement has to move into the agent runtime. Per-agent budget envelopes. Local kill-switches. Hard per-invocation ceilings that fire before the ledger is queried. This is what the operations tax of running AI at scale starts to look like once the agents outnumber the humans.
Third, the counterparty topology inverts. In traditional credit, the issuer bears the risk of the borrower failing to pay. In AI credit, the enterprise bears the risk of its own agents misbehaving. The vendor is the merchant. There is no underwriter pooling risk across enterprises and charging a spread. Every buyer is self-insuring, which means every buyer needs its own risk capital, its own reserve policy, and its own escalation protocol. The agentic commerce governance split made this visible on the payments side first. Engineering spend is next.
Fourth, token consumption becomes a managed performance axis rather than a cost leak. Governed well, high spend is a signal of leverage. Governed badly, it is the signal of tokenmaxxing without guardrails. The difference is whether finance has a control plane that can see what engineering is burning, at the same cadence engineering burns it.
Fifth, this is the cleanest example so far of governance leaving the engineering silo. The four models are engineering artifacts solving a finance problem. Neither function can own the outcome alone.
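The runtime-side enforcement from the second point above can be sketched as a per-agent envelope: a local budget, a hard per-invocation ceiling, and a kill-switch, all checked before any central ledger is queried. Sub-agents get carved-out child envelopes rather than inheriting the parent's key, which also addresses the attribution-loss failure mode. Names and limits here are illustrative assumptions, not any vendor's API.

```python
class AgentEnvelope:
    """Runtime-local budget enforcement, checked before the ledger.
    Illustrative sketch only."""
    def __init__(self, budget: float, per_call_ceiling: float):
        self.remaining = budget
        self.per_call_ceiling = per_call_ceiling
        self.killed = False

    def precheck(self, est_cost: float) -> bool:
        if self.killed:
            return False
        if est_cost > self.per_call_ceiling:  # hard per-invocation ceiling
            return False
        if est_cost > self.remaining:         # local kill-switch
            self.killed = True
            return False
        return True

    def settle(self, actual_cost: float):
        self.remaining -= actual_cost
        if self.remaining <= 0:
            self.killed = True

    def spawn_child(self, budget: float) -> "AgentEnvelope":
        budget = min(budget, self.remaining)  # carved out, never inherited
        self.remaining -= budget
        return AgentEnvelope(budget, self.per_call_ceiling)

env = AgentEnvelope(budget=10.0, per_call_ceiling=2.0)
print(env.precheck(1.5))  # True: within both limits
env.settle(1.5)
child = env.spawn_child(4.0)
print(env.remaining, child.remaining)  # parent shrinks, child is bounded
print(env.precheck(5.0))  # False: over the per-call ceiling
```

Because the precheck is local to the agent process, it fires at execution latency, not ledger latency. The central ledger still settles and attributes; it just stops being the only thing standing between an agent and a runaway bill.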
What CFOs should ask their platform teams before Q3
Five questions. Answer them before the next budget cycle, not after.
- Which of the four credit models does each of our current AI vendors implement, and where do they diverge? If the answer is “we have not mapped this,” map it.
- What is our enforcement latency versus our agents’ execution latency? If agents can burn faster than the ledger can block, the ledger is a reporting tool, not a control.
- Where are our per-agent budget envelopes, and who owns them — engineering, finance, or nobody?
- How much shadow AI spend is coming through expense reimbursement right now? If the number is rising, the corporate procurement layer has already failed for this category.
- When an agent runs away, what is the mean time between the first unauthorized dollar and the kill-switch firing? If the answer is measured in minutes, not milliseconds, the architecture is wrong.
The vendors are building credit governance in public. Watch what they ship. Borrow their primitives. Accept that their models are vendor-side ledgers and that enterprise-side credit policy — the layer that spans every vendor, every agent, every budget — is yours to own. Nobody is going to build it for you in 2026.
This analysis synthesizes Four Models for Credit Governance by Anton Zagrebelny (April 2026), the Ramp AI Index (April 2026), AWS Bedrock cost management guidance (2026), and Mastercard Agent Pay governance framework (2026).
Victorino Group helps finance and platform teams design AI spend governance that enforces at agent speed. Let’s talk.
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com . About The Thinking Wire →