Token to Outcome Becomes the Bill
Two signals landed in the same week, from people who do not work together, and they pointed at the same conclusion. A venture investor argued that the link between tokens and outcomes is becoming the place where executive power concentrates. A consulting CEO told a newspaper that three-quarters of his firm’s largest AI engagements now bill on variable terms, because clients refuse to pay for consumption that does not move the P&L. One of them was writing theory. The other was reporting from a billing department. They agreed.
I have written about token cost as a board and engineering signal, about the workforce inflection that comes when tokens become the unit of work, and about the collapse of per-seat licensing. Those pieces argued that tokens are the new cost line. That argument is settled. This week moved the story forward. The services industry has started repricing itself around outcome attribution, and a working theory of marginal token utility arrived at the same moment to explain why.
The billing department speaks first
Christoph Schweizer, CEO of Boston Consulting Group, sat down with The Wall Street Journal in May 2026. BCG grew revenue 7% to $14.4 billion. The number that matters is buried in how that revenue is now structured. Roughly three-quarters of the firm’s largest AI engagements are now billed on variable-fee terms rather than a fixed project price. Pure outcomes-based deals, where the fee is tied directly to a measured business result, are still rarer: “significantly less than a third” of all work. But the direction is unmistakable.
Schweizer gave the reason in plain language. Clients want AI work that does “not just drive token consumption, but actually see changes in the P&L and in how people work.” Read that twice. The CEO of one of the Big Three strategy consulting firms is telling a newspaper that token consumption is the thing buyers no longer want to pay for. They will pay for a changed P&L. They will pay for changed work. The meter running is not a result. It is an input that may or may not produce one.
That is a structural admission. When a firm that bills by the hour and by the deliverable starts tying its fee to a measured outcome, it is accepting attribution risk it used to push onto the client. BCG is betting it can prove its AI work moved a number. To make that bet, it has to measure the trace from spend to result. The bill itself is now an attribution claim.
The theory arrives the same week
Jaya Gupta, an investor at Foundation Capital, published a long-form piece on X in May 2026 titled “Token Budget Wars.” Her argument runs underneath the BCG news like a foundation under a building. The economics of agentic AI, she writes, are not legible through consumption dashboards, and the failure to make them legible is where the next control layer will be built.
Her math is the useful part. The same workflow, with the same input, can vary by a factor of 5 to 10 in token cost depending on how the agent plans, retries, and carries context. A completion-rate drop from 90% to 70%, which sounds modest, raises the effective cost per resolved task by roughly 28%, because every failed attempt still burned tokens before it failed. And context cost does not scale linearly. It scales roughly as O(n²), because each new token attends to every prior token. Double the context and you roughly quadruple the cost of processing it.
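The completion-rate arithmetic is easy to check. The sketch below is an illustration, not Gupta's own model: it assumes every attempt costs the same whether it succeeds or fails, and it treats context cost as purely quadratic. The function names are hypothetical.

```python
def cost_per_resolved_task(cost_per_attempt: float, completion_rate: float) -> float:
    """Expected spend per successfully resolved task.

    Assumption: a failed attempt costs as much as a successful one.
    """
    if not 0 < completion_rate <= 1:
        raise ValueError("completion_rate must be in (0, 1]")
    return cost_per_attempt / completion_rate


def relative_context_cost(context_multiplier: float) -> float:
    """Relative processing cost under a purely quadratic O(n^2) assumption."""
    return context_multiplier ** 2


baseline = cost_per_resolved_task(1.00, 0.90)
degraded = cost_per_resolved_task(1.00, 0.70)
increase = degraded / baseline - 1

print(f"Cost per resolved task rises by {increase:.1%}")  # 28.6%
print(f"Doubling context multiplies cost by {relative_context_cost(2):.0f}x")  # 4x
```

Under these assumptions the increase depends only on the ratio of the two completion rates (0.90 / 0.70), not on what a single attempt costs.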
Stack those three facts and the consumption dashboard becomes actively misleading. A team can cut its token count and still raise its real cost per outcome, because it shipped a cheaper agent that fails more often. A team can raise its token count and lower its cost per outcome, because it bought reliability that finishes the job on the first try. Tokens consumed tell you nothing about value produced. Gupta’s line lands the point: “SaaS usage told you the software had been adopted. AI usage tells you the meter is running. It doesn’t tell you whether your company is cooking.”
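A toy comparison shows how the two numbers can point in opposite directions. Every figure below is invented for illustration and comes from neither Gupta nor BCG: a "lean" agent that spends fewer tokens per attempt but fails half the time, and a "reliable" agent that spends more but usually finishes.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    name: str
    tokens_per_attempt: int
    completion_rate: float  # fraction of attempts that resolve the task

    def tokens_per_resolved_task(self) -> float:
        # Every attempt burns tokens; only completed ones count as outcomes.
        return self.tokens_per_attempt / self.completion_rate


# Hypothetical figures chosen for illustration only.
lean = AgentStats("lean", tokens_per_attempt=8_000, completion_rate=0.50)
reliable = AgentStats("reliable", tokens_per_attempt=12_000, completion_rate=0.95)

for agent in (lean, reliable):
    print(
        f"{agent.name}: {agent.tokens_per_attempt:,} tokens/attempt, "
        f"{agent.tokens_per_resolved_task():,.0f} tokens/resolved task"
    )
```

A dashboard that reports tokens per attempt would favor the lean agent. The per-outcome figure favors the reliable one.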
Why the two signals are one story
The convergence is the news. Gupta is describing, from first principles, why consumption metrics fail to capture value. BCG is acting on exactly that failure, from the other end, by repricing its book so its fee tracks outcomes instead of consumption. One is the diagnosis. The other is the treatment, already being administered to the largest AI engagements in the consulting world.
What connects them is the object that neither consumption metrics nor traditional billing captures: the decision trace. It is the record of what the agent was asked to do, what it tried, what it retried, what it cost, and whether the business result actually moved. BCG cannot bill on outcomes without that trace. Gupta cannot price marginal token utility without it. The durable artifact of this shift is not the cost report. It is the trace that ties spend to result, attempt by attempt.
Consumption dashboards were built for a SaaS world where adoption was the proxy for value. You logged in, therefore you got value. Agentic AI breaks that proxy. The agent runs whether or not it succeeds. It bills whether or not it succeeds. The only way to know if you got value is to attribute the spend to an outcome, and the only way to do that is to keep the trace.
The control layer moves
Here is the strategic consequence. Whoever owns the attribution layer owns the conversation about whether AI is working. Today that conversation is owned by the vendor, because the vendor supplies the dashboard, and the dashboard reports consumption. Consumption always goes up. The vendor’s dashboard will never tell you to spend less.
When attribution moves in-house, the conversation moves with it. The executive who can show that $40,000 of agent spend produced a measured P&L change holds a different kind of authority than the one who can only show that the meter ran. BCG is monetizing that authority directly by pricing against it. Gupta is naming it as the locus where power will concentrate. The buyer who builds it stops being a price-taker on AI.
Do this now
Stop reporting token consumption as a top-line AI metric. It is an input, not a result, and leading with it trains your whole organization to optimize the wrong number. Replace it with cost per resolved outcome: total spend on a workflow divided by the count of tasks it actually completed to a measured standard. That single ratio captures Gupta’s completion-rate math, because failed attempts inflate the numerator without adding to the denominator.
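One way to compute that ratio per workflow is sketched here. The `Attempt` record, its field names, and the sample data are all hypothetical, and the sketch assumes you can already label each attempt as resolved or not against your own measured standard.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Attempt(NamedTuple):
    workflow: str
    cost_usd: float
    resolved: bool  # True only if the task met the measured standard


def cost_per_resolved_outcome(attempts: Iterable[Attempt]) -> dict[str, float]:
    """Total spend per workflow divided by its count of resolved tasks."""
    spend: dict[str, float] = defaultdict(float)
    resolved: dict[str, int] = defaultdict(int)
    for a in attempts:
        spend[a.workflow] += a.cost_usd          # failed attempts still add spend
        resolved[a.workflow] += int(a.resolved)  # only resolved tasks count
    # Spend with zero resolved tasks is reported as infinite cost per outcome.
    return {
        wf: (spend[wf] / resolved[wf]) if resolved[wf] else float("inf")
        for wf in spend
    }


# Hypothetical sample data.
sample = [
    Attempt("invoice-triage", 0.40, True),
    Attempt("invoice-triage", 0.35, False),
    Attempt("invoice-triage", 0.45, True),
    Attempt("contract-review", 2.10, False),
]
report = cost_per_resolved_outcome(sample)
print(report)
```

Reporting a workflow with no resolved tasks as infinite, rather than dropping it, is a deliberate choice in this sketch: a workflow that spends without ever finishing is exactly what the ratio should expose.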
Then keep the trace. For every meaningful agent workflow, log what was asked, what it tried, what it cost, and whether the business result moved. That trace is what lets you negotiate variable-fee terms with a vendor the way BCG’s clients now negotiate with BCG. It is what lets you prove value to your own board. The cost report tells you the meter ran. The trace tells you whether your company is cooking. Build the one that answers the question that matters.
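A trace record does not need to be elaborate. The following is a minimal sketch of one possible shape, assuming an append-only log of JSON lines. The `TraceRecord` class, its field names, and the sample values are hypothetical and not a format described by BCG or Gupta.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceRecord:
    workflow: str
    request: str          # what was asked
    steps: list[str]      # what the agent tried, including retries
    cost_usd: float       # what it cost
    outcome_moved: bool   # whether the measured business result moved
    outcome_note: str = ""
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        """Serialize as a single JSON line for an append-only log."""
        return json.dumps(asdict(self))


# Hypothetical example values.
record = TraceRecord(
    workflow="invoice-triage",
    request="Classify and route an incoming invoice",
    steps=["plan", "extract fields", "retry extract", "route"],
    cost_usd=0.42,
    outcome_moved=True,
    outcome_note="routed without manual rework",
)
line = record.to_json_line()
print(line)
```

Keeping cost and outcome in the same record is the point: attribution then needs no join against a separate billing export.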
This analysis synthesizes “AI Is Changing How Consultants Get Paid, BCG’s CEO Says” (The Wall Street Journal, May 2026) and “Token Budget Wars” (Jaya Gupta, Foundation Capital, May 2026).
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.