As AI adoption accelerates, tokens—not licenses or head counts—are becoming the true unit of cost. Understanding AI tokenomics gives CFOs a new lens to manage margins, forecast risk, and help C-suites make smarter capital decisions.
AI consumption is scaling fast, with implications that extend far beyond the remit of technology organizations.1 Over the past seven years, AI infrastructure growth has exceeded historical benchmarks for major US infrastructure expansions, including the interstate highway system and the dot-com era. Enterprise adoption of AI reflects this acceleration: as of Q2 2025, nearly three-quarters of executives reported investing in AI within the previous 12 months. AI has also become one of the largest incremental technology expenses for many organizations, with 50% of leaders reporting they’re spending 21%–50% of their digital transformation budgets on AI.2
For CFOs, this is not simply another IT investment cycle. AI represents a structural shift in cost behavior—one that introduces new sources of volatility into operating expense, margins, forecasts, and capital planning. The challenge, therefore, is not whether to invest in AI but whether finance leaders can see, measure, and govern its economics early enough to shape outcomes.
As one investment management CFO sees it, “Most organizations are facing the same pressure to get tech expenses under control to free up budget for AI spend, no questions asked.”
He continues, “even though productivity conversations are still the primary focus for many organizations, there’s clearly a perceived first-mover advantage even though revenue conversations haven’t started yet. Most organizations don’t yet understand how to calculate the total cost of ownership behind AI intelligence.”3
Most CFOs are accustomed to managing technology spend through familiar levers: licenses, head counts, infrastructure capacities, and depreciation schedules. AI does not conform to those models.
AI costs can be usage-driven, nonlinear, and highly variable. Many organizations are still shielded from direct AI costs, which today are often embedded within enterprise resource planning (ERP), customer relationship management, human resources, or SaaS platforms, but that insulation appears temporary. Agentic capabilities are already shifting pricing models away from per-seat licensing toward usage-based, outcome-based, or value-based constructs, and cost models are expected to keep shifting over time.
The result: Material implications for forecasting accuracy and margin control. For organizations running AI applications on cloud, token operating costs are visibly metered. For those with higher token volumes driven by scaling and complexity, these costs are already appearing as an unforecasted expense.
At the core of AI economics is a new unit of consumption: the token. Every AI interaction, whether text, image, code, or decision, consumes tokens. Both inputs (prompts, context, data retrieval) and outputs (responses, actions) are metered. Each token has a price that varies by model, infrastructure, configuration, and market conditions. This matters because tokens, rather than users, licenses, or time, can drive cost.
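As a rough illustration of how that metering adds up, the sketch below prices a single interaction from its input and output token counts. The `TokenPrice` type, the `interaction_cost` function, and the per-million-token rates are hypothetical and chosen only for the example; actual prices depend on the model, provider, and contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical per-million-token prices in USD (not real vendor rates)."""
    input_per_million: float
    output_per_million: float


def interaction_cost(input_tokens: int, output_tokens: int, price: TokenPrice) -> float:
    """Cost of one metered interaction; inputs and outputs are priced separately."""
    return (
        input_tokens / 1_000_000 * price.input_per_million
        + output_tokens / 1_000_000 * price.output_per_million
    )


# Assumed illustrative rates; substitute the contracted prices for a given model.
example_price = TokenPrice(input_per_million=2.00, output_per_million=8.00)

# A 3,000-token prompt (including retrieved context) and a 1,000-token response.
cost = interaction_cost(3_000, 1_000, example_price)
print(f"${cost:.4f} per interaction")
```

Multiplying a per-interaction figure like this by the number of users and interactions per month gives a first approximation of how usage, rather than seats, can drive the bill.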
As AI adoption scales:
The number of tokens an organization generates each month reflects a combination of scale, prompting behaviors (e.g., character length of the prompt, number of interactions), and model complexity.
A move toward AI-everything and digitized applications across the enterprise will fuel token growth, requiring finance and tech leaders to better understand the impacts on data center infrastructure. Decisions related to investing in GPU capacity, colocation space, and power/cooling ultimately affect both the operating expense (opex) and capital expense (capex) sides of tokenomics.
A single “simple” conversational interaction can generate thousands of tokens. At scale, millions of tokens can quickly become billions—or trillions—per month. In one example from our work with a large health care enterprise, token usage growth of 8%–10% per month (1 trillion tokens over six months) translated into more than $6 million in annualized, previously unplanned cost increases before the finance team had visibility into the driver behind it.
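To see how a monthly growth rate in that range compounds, the sketch below computes the 12-month volume multiple and a short projection. Only the 8%–10% growth range comes from the example above; the starting volume of 100 billion tokens per month and the function names are assumptions made for illustration.

```python
def annual_growth_multiple(monthly_growth: float) -> float:
    """How many times larger monthly volume is after 12 months of compounding."""
    return (1 + monthly_growth) ** 12


def project_monthly_tokens(start_tokens: float, monthly_growth: float, months: int) -> list[float]:
    """Projected token volume for each month, starting from month 0."""
    return [start_tokens * (1 + monthly_growth) ** month for month in range(months)]


# Growth range taken from the example in the text.
low_multiple = annual_growth_multiple(0.08)
high_multiple = annual_growth_multiple(0.10)
print(f"12-month volume multiple: {low_multiple:.2f}x to {high_multiple:.2f}x")

# Hypothetical starting point of 100 billion tokens per month (an assumption).
projection = project_monthly_tokens(100e9, 0.09, 6)
print(f"Total over six months: {sum(projection) / 1e9:.0f} billion tokens")
```

At 8% to 10% per month, volume grows roughly 2.5 to 3.1 times over a year, which is why a growth rate that looks modest month to month can outrun an annual budget set at the starting run rate.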
| Every prompt generates input and output tokens | ~1,500 words ≈ 2,048 tokens generated; therefore, an iterative conversation may generate 5,000 tokens,4 including input and output tokens5 |
| A single user can generate millions of tokens when inferencing (using an AI solution in production) | 9.4 million tokens per year per subscriber for a basic chatbot and as much as 356 million tokens per year per user for super agents6 |
| Tokens per month vary dramatically, so cost will as well | One Seattle-based software startup reported nearly 1 billion tokens consumed in 30 days, introducing unpredictable costs tied to volatile usage and pricing structures.7 Google processed 480 trillion tokens per month in 2025 across its products and APIs, up 50 times from the same time the year prior8 |
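The per-user figures above can be turned into a rough annual cost once a price is attached. The sketch below uses the words-to-tokens rule of thumb and the per-user volumes cited in the table; the blended price of $5 per million tokens and the function names are assumptions for illustration, not quoted rates.

```python
# Rule of thumb cited in the text: ~1,500 words is roughly 2,048 tokens.
TOKENS_PER_WORD = 2048 / 1500


def estimate_tokens(word_count: int) -> int:
    """Approximate token count for a given number of words."""
    return round(word_count * TOKENS_PER_WORD)


def annual_cost_per_user(tokens_per_year: float, blended_price_per_million: float) -> float:
    """Annual cost for one user at a blended (input plus output) token price in USD."""
    return tokens_per_year / 1_000_000 * blended_price_per_million


ASSUMED_PRICE = 5.00  # hypothetical blended USD price per million tokens

chatbot_cost = annual_cost_per_user(9.4e6, ASSUMED_PRICE)      # basic chatbot subscriber
super_agent_cost = annual_cost_per_user(356e6, ASSUMED_PRICE)  # super-agent user
print(f"Basic chatbot user: ${chatbot_cost:,.2f} per year")
print(f"Super-agent user:   ${super_agent_cost:,.2f} per year")
```

Whatever price is assumed, the ratio between the two user types stays the same (roughly 38 to 1), so the mix of use cases matters as much as the number of users.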
Tokens function like variable input costs in a manufacturing model, except demand is harder to constrain. Growing token volumes are a natural consequence of AI scaling across a set of variables: users, data modes, agentic workflow design, prompting hygiene/guardrails, and many other factors. Technology leaders are only starting to get a handle on these dynamics, with the largest-scale adopters feeling the impacts first. For example, AT&T scaled AI with solutions in use across more than 100,000 employees plus worker agents. The impact: 8 billion tokens a day. The company then implemented a multiagent system with super agents overseeing worker agents, leading to 90% cost savings while more than tripling token volumes to 27 billion per day within months.9
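A back-of-the-envelope reading of the AT&T figures shows how total spend and per-token cost can move differently. The sketch assumes, as an interpretation the source does not confirm, that the 90% savings refers to total spend after the redesign relative to spend before it; under a different reading the result would change. The function name is hypothetical.

```python
def implied_unit_cost_ratio(spend_ratio: float, volume_ratio: float) -> float:
    """New cost per token as a fraction of the old cost per token."""
    return spend_ratio / volume_ratio


volume_ratio = 27e9 / 8e9  # daily tokens after vs. before, from the text
spend_ratio = 1 - 0.90     # ASSUMPTION: total spend fell by 90%

unit_ratio = implied_unit_cost_ratio(spend_ratio, volume_ratio)
print(f"Volume grew {volume_ratio:.2f}x; cost per token fell to {unit_ratio:.1%} of its prior level")
```

Under that assumption, architecture choices changed the cost per token far more than the headline savings figure alone suggests, which is why unit cost and volume are worth tracking as separate lines.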
The implications also vary across hosting models (such as SaaS, cloud-based, or hosted solutions), which all influence the unit pricing of tokens. These factors may fluctuate over time as programs scale, making it challenging for CFOs to forecast potential increases in expenditures.
Source: Deloitte, "Navigate the economics of AI: How tokenomics is reshaping AI costs and ROI," January 2026.
Note: Because token costs are obfuscated and variable across SaaS providers, we’ve limited the modeling to cloud-based and self-hosted options only.
Deloitte’s AI token economics analysis highlights that unit economics shift meaningfully at scale. As token volumes grow, different deployment models become more or less cost-effective:
Based on a Deloitte AI Infrastructure 2028 survey of 550 US enterprise leaders, the number of AI tokens that companies are burning through, and paying for, is already high and set to rise quickly. Our survey data suggests that many companies already generate more than 10 billion tokens per month. Moreover, the proportion of respondents that expect to generate more than 100 billion tokens per month is projected to triple from 2025 to 2028.
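One way to reason about how volumes at these levels interact with deployment choices is a simple break-even comparison between a pay-per-token cloud option and a self-hosted option with a fixed monthly cost. The sketch below is not Deloitte's model; every price and cost in it is a hypothetical placeholder, and the linear cost shapes are a simplifying assumption.

```python
from typing import Optional


def cloud_monthly_cost(tokens: float, price_per_million: float) -> float:
    """Pure usage-based cost: every token is billed at the metered rate."""
    return tokens / 1_000_000 * price_per_million


def self_hosted_monthly_cost(tokens: float, fixed_monthly: float, marginal_per_million: float) -> float:
    """Fixed infrastructure cost plus a smaller marginal cost per token."""
    return fixed_monthly + tokens / 1_000_000 * marginal_per_million


def break_even_tokens(price_per_million: float, fixed_monthly: float,
                      marginal_per_million: float) -> Optional[float]:
    """Monthly volume at which both options cost the same, or None if cloud is never dearer."""
    if price_per_million <= marginal_per_million:
        return None
    return fixed_monthly / (price_per_million - marginal_per_million) * 1_000_000


# All figures below are assumptions for illustration only.
CLOUD_PRICE = 4.00          # USD per million tokens
FIXED_MONTHLY = 200_000.00  # USD per month for self-hosted capacity
MARGINAL_PRICE = 1.00       # USD per million tokens (power, operations)

threshold = break_even_tokens(CLOUD_PRICE, FIXED_MONTHLY, MARGINAL_PRICE)
print(f"Break-even at about {threshold / 1e9:.1f} billion tokens per month")

for volume in (10e9, 100e9):  # the monthly volumes mentioned in the survey
    cloud = cloud_monthly_cost(volume, CLOUD_PRICE)
    hosted = self_hosted_monthly_cost(volume, FIXED_MONTHLY, MARGINAL_PRICE)
    print(f"{volume / 1e9:.0f}B tokens: cloud ${cloud:,.0f} vs. self-hosted ${hosted:,.0f}")
```

With these placeholder inputs the cheaper option flips somewhere between 10 billion and 100 billion tokens per month. The useful output is the threshold itself, which can be recomputed whenever prices or capacity costs change.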
Many CFOs may start to ask, “At what point do I intervene? Which lever do I pull? What breaks if I ignore this?” CFOs don’t always choose architectures, but they do often set economic constraints, thresholds, and escalation points. There can also be ramifications wherever these token economics map to CFO accountabilities.
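Thresholds and escalation points of this kind can be expressed as a simple rule on the gap between actual and forecast token spend. The sketch below is one possible shape for such a rule; the 10% and 25% tolerances, the action labels, and the function name are assumptions rather than recommended values.

```python
from enum import Enum


class Action(Enum):
    OK = "within tolerance"
    REVIEW = "finance review"
    ESCALATE = "escalate to CFO"


def check_token_spend(actual: float, forecast: float,
                      review_at: float = 0.10, escalate_at: float = 0.25) -> Action:
    """Compare actual token spend with forecast and pick an action by overrun size."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    overrun = (actual - forecast) / forecast
    if overrun >= escalate_at:
        return Action.ESCALATE
    if overrun >= review_at:
        return Action.REVIEW
    return Action.OK


# Hypothetical monthly figures in USD.
for actual in (100_000, 115_000, 130_000):
    print(actual, check_token_spend(actual, forecast=100_000).value)
```

The tolerances themselves are a policy choice; the point of encoding them is that the intervention question gets answered before the overrun occurs rather than after.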
For CFOs, unmanaged token dynamics can create four immediate risks:
The implication for CFOs is not that one model is “right,” but that economic inflection points can be predictable if token demand is understood early. CFOs do not need to track every token, but they do need to ensure that spend increases are grounded in a strategy and that total cost of ownership (TCO) is treated as an investment input across a clear value chain. They should govern token economics with the same rigor they apply to capital allocation, guided by three key principles:
As Salami of TIAA Nuveen said, “From the front desk to the boardroom, every investment should empower organizations to anticipate market shifts, differentiate, and deliver meaningful outcomes for clients and shareholders.”14
AI programs need to be given an equally hard look to determine which programs are strategic and which may not be. If AI scaling is deemed strategic, these token dynamics will need to be CFO-managed and factored into forecasts, margins, capital risk projections, and much more. This requires a tight partnership with business and tech leaders to validate the strategic expense and understand total cost of ownership as an investment input to ROI. It also requires managing costs over time, whether that means making a strategic investment in private hosting or exploring new financial management approaches like chargebacks.
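As an illustration of the chargeback idea, the sketch below splits a shared monthly AI bill across business units in proportion to the tokens each consumed. The unit names, token counts, bill amount, and function name are all hypothetical, and proportional allocation is only one of several possible chargeback methods.

```python
def allocate_chargebacks(total_bill: float, tokens_by_unit: dict[str, int]) -> dict[str, float]:
    """Split a shared bill across units in proportion to their token usage."""
    total_tokens = sum(tokens_by_unit.values())
    if total_tokens == 0:
        raise ValueError("no token usage recorded")
    return {
        unit: total_bill * tokens / total_tokens
        for unit, tokens in tokens_by_unit.items()
    }


# Hypothetical monthly usage (in tokens) and a hypothetical shared bill in USD.
usage = {
    "claims": 600_000_000,
    "underwriting": 300_000_000,
    "finance": 100_000_000,
}
charges = allocate_chargebacks(50_000.00, usage)
for unit, amount in charges.items():
    print(f"{unit}: ${amount:,.2f}")
```

Allocating by consumption makes each business unit's token demand visible in its own budget, which supports the partnership between finance, business, and tech leaders described above.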
AI is not just a technology investment—it is an economic system. Left unmanaged, it introduces volatility, margin pressure, and capital risk. Governed well, it can become a lever for sustained enterprise value. The CFO who can connect tokens to the P&L through foresight, discipline, and scenario modeling will not only control costs but shape long-term competitive advantage.
Endnotes
1. Diana Kearns-Manolatos, "As cloud costs rise, hybrid solutions are redefining the path to scaling AI," Deloitte, November 5, 2025.
2. Tim Smith, Gregory Dost, Gharima Dhasmana, Parth Patwari, Diana Kearns-Manolatos, Iram Parveen, "AI is capturing the digital dollar. What’s left for the rest of the tech estate?", Deloitte, October 15, 2025.
3. CFO of an investment management firm, phone interview with the Deloitte team, January 2026.
4. OpenAI, OpenAI Help Center, "What are tokens and how to count them?", updated January 2026, accessed March 25, 2026.
5. Nicholas Merizzi, Tim Smith, Nitin Mittal, Gaurav Churiwala, Diana Kearns-Manolatos, "AI tokens: How to navigate AI’s new spend dynamics", Deloitte, January 19, 2026.
6. Alistair Barr, "Is the tech industry ready for AI 'super agents'?", Business Insider, April 2, 2025, accessed March 27, 2026.
7. CFO of an investment management firm, phone interview with the Deloitte team, February 2026.
8. Sundar Pichai, "Google I/O 2025: From research to reality," Google, Sundar Pichai’s opening keynote as transcribed, May 20, 2025, accessed March 25, 2026.
9. VentureBeat, "8 billion tokens a day forced AT&T to rethink AI orchestration," February 26, 2026, accessed March 25, 2026.
10. Felix Bartler, SAP, "Monitor Token Usage with SAP Generative AI Hub," SAP Community blog post, January 10, 2025, accessed March 25, 2026.
11. Anjali Shaikh, Steve Pratt and Megan Turchi, "Nuveen CFO on elevating CIO-CFO Strategic Relationships," The Wall Street Journal, CFO Journal, January 30, 2026; WSJ content sponsored by Deloitte, accessed March 25, 2026.
12. Katherine Bindley, "You’ve Finally Figured Out AI at Work—Now Comes the Bill," The Wall Street Journal, March 17, 2026, accessed March 25, 2026.
13. Bindley, "You’ve Finally Figured Out AI at Work—Now Comes the Bill."
14. Shaikh, Pratt and Turchi, "Nuveen CFO on elevating CIO-CFO Strategic Relationships."