AI tokenomics: A CFO’s guide to governing the AI P&L

How finance leaders could evaluate AI’s impact on their P&L, capital allocation, and operating expense—before margins move.

As AI adoption accelerates, tokens—not licenses or head counts—are becoming the true unit of cost. Understanding AI tokenomics gives CFOs a new lens to manage margins, forecast risk, and help C-suites make smarter capital decisions.

What CFOs need to know about AI economics

AI consumption is scaling fast with implications that extend far beyond the remit of technology organizations.1 Over the past seven years, AI infrastructure has exceeded historical benchmarks for major US infrastructure expansions, including the interstate highway system and the dot-com era. Enterprise adoption of AI reflects this infrastructure's acceleration: as of Q2 2025, nearly three-quarters of executives reported investing in AI within the previous 12 months. And AI has become one of the largest incremental technology expenses for many organizations, with 50% of leaders reporting they’re spending 21%–50% of their digital transformation budgets on AI.2

For CFOs, this is not simply another IT investment cycle. AI represents a structural shift in cost behavior—one that introduces new sources of volatility into operating expense, margins, forecasts, and capital planning. The challenge, therefore, is not whether to invest in AI but whether finance leaders can see, measure, and govern its economics early enough to shape outcomes.

As one investment management CFO sees it, “Most organizations are facing the same pressure to get tech expenses under control to free up budget for AI spend, no questions asked.”

He continues, “Even though productivity conversations are still the primary focus for many organizations, there's clearly a perceived first-mover advantage, even though revenue conversations haven't started yet. Most organizations don't yet understand how to calculate the total cost of ownership behind AI intelligence.”3

Why AI breaks traditional cost management across all models

Most CFOs are accustomed to managing technology spend through familiar levers: licenses, head counts, infrastructure capacities, and depreciation schedules. AI does not conform to those models.

AI costs can be usage-driven, nonlinear, and highly variable. Many organizations are still shielded from direct AI costs because those costs are often embedded within enterprise resource planning (ERP), customer relationship management, human resources, or SaaS platforms, but that insulation appears temporary. Agentic capabilities are already shifting pricing away from per-seat licensing toward usage-based, outcome-based, or value-based constructs, and cost models are expected to keep shifting over time.

The result: material implications for forecasting accuracy and margin control. For organizations running AI applications in the cloud, token operating costs are visibly metered. For some organizations with higher token volumes driven by scaling and complexity, they are already appearing as an unforecasted expense.

How to understand your own demand and how this could change

At the core of AI economics is a new unit of consumption: the token. Every AI interaction, whether text, image, code, or decision, consumes tokens. Both inputs (prompts, context, data retrieval) and outputs (responses, actions) are metered. Each token has a price that varies by model, infrastructure, configuration, and market conditions. This matters because tokens, rather than users, licenses, or time, can drive cost.

As AI adoption scales:

  • Token volumes increase with more users, richer data, and more autonomous workflows.
  • Costs accelerate nonlinearly as models grow more complex.
  • Spend becomes harder to predict without explicit demand management.

What drives token growth and what are the implications? 

The number of tokens an organization generates each month reflects a combination of scale, prompting behaviors (e.g., character length of the prompt, number of interactions), and model complexity.

A move toward AI-everything and digitized applications across the enterprise will fuel token growth, requiring finance and tech leaders to better understand the impacts on data center infrastructure. Decisions about investing in GPU capacity, colocation space, and power and cooling ultimately affect both operating expense (opex) and capital expense (capex) tokenomics.

A single “simple” conversational interaction can generate thousands of tokens. At scale, millions of tokens can quickly become billions—or trillions—per month. In one example from our work with a large health care enterprise, token usage growth of 8%–10% per month (1 trillion tokens over six months) translated into more than $6 million in annualized, previously unplanned cost increases before the finance team had visibility into the driver behind it.
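To see how a monthly growth rate in the range quoted above compounds, here is a minimal sketch. The baseline volume and the function name `project_monthly_tokens` are illustrative assumptions, not figures from the health care example.

```python
def project_monthly_tokens(start_tokens: float, monthly_growth: float, months: int) -> list[float]:
    """Return projected monthly token volumes under constant compound growth.

    start_tokens is the month-0 volume (a hypothetical input);
    monthly_growth is a fraction, such as 0.08 for 8%.
    """
    if months < 0:
        raise ValueError("months must be non-negative")
    return [start_tokens * (1 + monthly_growth) ** m for m in range(months + 1)]


# Hypothetical baseline of 100 billion tokens per month (assumption for illustration).
baseline = 100e9
for rate in (0.08, 0.10):
    series = project_monthly_tokens(baseline, rate, months=12)
    growth_multiple = series[-1] / series[0]
    print(f"{rate:.0%} monthly growth -> {growth_multiple:.2f}x volume after 12 months")
```

Under these assumptions, steady growth of 8% to 10% per month compounds to roughly 2.5 to 3.1 times the starting volume within a year, which is why a seemingly modest monthly rate can outrun an annual budget.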

Token ranges are vast and directly grounded in engineering choices, system complexity, and user behaviors

Every prompt generates input and output tokens

~1,500 words ≈ 2,048 tokens generated; an iterative conversation may therefore generate 5,000 tokens,4 including input and output tokens.5
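The rule of thumb above implies roughly 1.37 tokens per English word (2,048 / 1,500). A back-of-the-envelope estimator might look like the sketch below. The function names and per-turn word counts are hypothetical, and real tokenizers vary by model and language, so actual counts will differ.

```python
# Ratio implied by the rule of thumb cited above: ~1,500 words ≈ 2,048 tokens.
TOKENS_PER_WORD = 2048 / 1500


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate for English text; not a substitute for a real tokenizer."""
    return round(word_count * TOKENS_PER_WORD)


def estimate_conversation_tokens(turns: list[tuple[int, int]]) -> int:
    """Sum estimated tokens over (prompt_words, response_words) pairs.

    Simplification: ignores that earlier turns are often resent as context,
    which would push input tokens higher.
    """
    return sum(estimate_tokens(p) + estimate_tokens(r) for p, r in turns)


# Hypothetical five-turn exchange; the word counts are illustrative assumptions.
turns = [(150, 600), (80, 500), (120, 700), (60, 400), (100, 650)]
print(estimate_conversation_tokens(turns))
```

An estimate like this is only a planning aid. Metered usage reported by the provider remains the figure to reconcile against.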

A single user can generate millions of tokens when inferencing (using an AI solution in production)

9.4 million tokens per year per subscriber for a basic chatbot and as much as 356 million tokens per year per user for super agents6

Tokens per month vary dramatically, so cost will as well

One Seattle-based software startup reported nearly 1 billion tokens consumed in 30 days, introducing unpredictable costs tied to volatile usage and pricing structures.7 Google processed 480 trillion tokens per month in 2025 across its products and APIs, up 50 times from the same time the year prior.8

Tokens function like variable input costs in a manufacturing model, except demand is harder to constrain. Growing token volumes are a natural consequence of AI scaling across a set of variables: users, data modes, agentic workflow design, prompting hygiene/guardrails, and many other factors. Technology leaders are only starting to get a handle on these dynamics, with the largest-scale adopters feeling the impacts first. For example, AT&T scaled AI with solutions in use across more than 100,000 employees plus worker agents. The impact: 8 billion tokens a day. The company then implemented a multiagent system with super agents overseeing worker agents, leading to 90% cost savings while more than tripling token volumes to 27 billion per day within months.9

The implications also vary across hosting models (such as SaaS, cloud-based, or hosted solutions), which all influence the unit pricing of tokens. These factors may fluctuate over time as programs scale, making it challenging for CFOs to forecast potential increases in expenditures.

 

Figure 1. Cost per AI token as usage scales across modalities

Total cost of ownership per million tokens
 

Source: Deloitte, "Navigate the economics of AI: How tokenomics is reshaping AI costs and ROI," January 2026.

Note: Given token costs are obfuscated and variable across SaaS providers, we’ve limited the modeling to cloud-based and self-hosted options only.

 

Deloitte’s AI token economics analysis highlights that unit economics shift meaningfully at scale. As token volumes grow, different deployment models become more or less cost-effective:

  • While token prices may be obfuscated in some SaaS solutions, providers of those solutions have started to integrate token metering capabilities.10
  • At lower volumes, cloud application programming interfaces (APIs) offer flexibility despite higher unit costs.
  • At midscale, alternative cloud or “neocloud” options can improve economics.
  • At sustained high volumes, AI factories or self-hosted models may deliver the lowest unit cost—but require upfront capital investment.
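One way to make these inflection points concrete is a simple break-even comparison between a pay-per-token option and an option that carries a fixed monthly cost but a lower unit cost. The sketch below is illustrative only. All prices and the function names are hypothetical placeholders, not figures from Deloitte's analysis or from any vendor.

```python
def monthly_cost(tokens_millions: float, price_per_million: float, fixed_monthly: float = 0.0) -> float:
    """Total monthly cost: fixed component plus metered per-million-token charge."""
    return fixed_monthly + tokens_millions * price_per_million


def break_even_volume(api_price: float, hosted_price: float, hosted_fixed: float) -> float:
    """Monthly volume (millions of tokens) above which the hosted option is cheaper.

    Assumes the API option has no fixed cost and a higher unit price.
    """
    if api_price <= hosted_price:
        raise ValueError("hosted option never breaks even if its unit price is not lower")
    return hosted_fixed / (api_price - hosted_price)


# Hypothetical inputs (assumptions for illustration only):
API_PRICE = 5.00        # USD per million tokens, pay-as-you-go
HOSTED_PRICE = 1.00     # USD per million tokens, marginal cost when self-hosted
HOSTED_FIXED = 200_000  # USD per month, amortized capex plus operations

threshold = break_even_volume(API_PRICE, HOSTED_PRICE, HOSTED_FIXED)
print(f"Break-even at about {threshold:,.0f} million tokens per month")
```

With these placeholder inputs the crossover sits at 50 billion tokens per month. The value of the exercise is less the number itself than knowing which inputs (unit price gap, fixed cost) move it.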

Based on a Deloitte AI Infrastructure 2028 survey of 550 US enterprise leaders, the amount of AI tokens that companies are burning through—and paying for—is already high and set to rise quickly. Our survey data suggests that many companies today already generate above 10 billion tokens per month. Moreover, the proportion of respondents that expect to be generating more than 100 billion per month is projected to triple from 2025 to 2028.

Many CFOs may start to ask, “At what point do I intervene? Which lever do I pull? What breaks if I ignore this?” CFOs don’t always choose architectures, but they often set economic constraints, thresholds, and escalation points. Token economics also carry ramifications wherever they map to CFO accountabilities.

Why getting a handle on AI token dynamics can have real ramifications for the CFO

For CFOs, unmanaged token dynamics can create four immediate risks:

AI spend behaves like a variable input cost, not a fixed overhead. As adoption rises, forecast error can increase unless token consumption is modeled explicitly. Forecast models may need substantial updates; for example, an AI cost-for-build scenario might be estimated as (avg. tokens per user × user volume × cost per token) × (a model mix factor). Where usage and scaling are less settled and more variable, token budgeting guidelines may also be needed.
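The forecast relationship described above can be sketched as a small scenario model. This is a minimal illustration, assuming a single blended cost per token and a scalar model mix factor. The class name, field names, and all scenario values are hypothetical and not drawn from any organization's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenCostScenario:
    """Inputs for one forecast scenario; all field names are illustrative."""
    avg_tokens_per_user: float      # tokens per user per month
    user_volume: int                # active users per month
    cost_per_token: float           # blended USD per token
    model_mix_factor: float = 1.0   # >1 when usage shifts toward pricier models


def forecast_monthly_cost(s: TokenCostScenario) -> float:
    """(avg. tokens per user x user volume x cost per token) x model mix factor."""
    base = s.avg_tokens_per_user * s.user_volume * s.cost_per_token
    return base * s.model_mix_factor


# Hypothetical scenarios; none of these values come from the article.
scenarios = {
    "conservative": TokenCostScenario(500_000, 2_000, 2e-6, 1.0),
    "expected": TokenCostScenario(1_000_000, 3_000, 2e-6, 1.2),
    "aggressive": TokenCostScenario(3_000_000, 5_000, 2e-6, 1.5),
}

for name, scenario in scenarios.items():
    print(f"{name}: ${forecast_monthly_cost(scenario):,.0f} per month")
```

Running several scenarios side by side, rather than a single point estimate, is one way to express the variability that token budgeting guidelines would need to cover.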

Token costs are often distributed across SaaS contracts, cloud bills, or professional services invoices. Margin pressure emerges gradually—often only after scale is locked in—so by the time it shows up in financial results, options may be limited. Leaders can get ahead of margin leakage by speaking with SaaS providers about token management terms when contracting, asking for visibility into their metered usage to better understand their current token footprint. 

For organizations hosting AI workloads in the cloud, tighter management of opex may be needed, with greater scrutiny of user adoption and behaviors, solution design, and finops. Depending on the strategy and scale, private hosting may make sense in some cases. In each case, the cost is not going away, so leaders need an approach to validate their strategy and manage the finances.

“It’s only a matter of time. Investment management has healthier margins than many other industries that could already be feeling the other side of the coin, not just to spend but the pressure to carry that spend,” one surveyed CFO said. “Eventually, AI becomes like market data for financial services organizations, an expense they need to carry as part of the cost of doing business.”

Without early visibility, CFOs can be forced into reactive capex decisions, such as self-hosting or specialized infrastructure, after token volumes are already committed. To help CFOs be proactive rather than reactive, Seun Salami, CFO of TIAA Nuveen, champions a strong, collaborative relationship between finance and technology leaders. “Please befriend your CFO,” he implored tech leaders early in 2026. “To secure approval for transformative technology investments, CIOs should understand not only how the business drives revenue but also the sources of risk and true competitive differentiation.”11

Analysts may ask where AI return on investment shows up. If finance cannot clearly articulate how AI investment translates into revenue uplift, cost-to-serve reduction or productivity gains—net of token costs—the earnings story can weaken.

What tokenomics ultimately means for CFOs

The implication for CFOs is not that one model is “right,” but that economic inflection points can be predictable if token demand is understood early. CFOs do not need to track every token, but they do need to ensure that spend increases are grounded in a strategy and that total cost of ownership (TCO) is seen as an investment input across a clear value chain. They should govern token economics with the same rigor applied to capital allocation, guided by three key principles:

If AI is important to your organization, the key question is whether it is being deployed in the places that matter most. Too often, AI strategies are driven by fear of missing out rather than by business value. The priority should be strategic investments. Sometimes that means using large language models or embedding AI into the ERP to create new capabilities or optimize high-value processes. But making the right choices often requires an enterprisewide perspective. AI is not a business-unit issue—it's an enterprise issue, and investments need to match the organization’s ambitions. That strategy then becomes the throughline to assess which applications are strategic for continued funding and which may not be relative to the application footprint and corresponding investment.

AI only matters if it translates into shareholder value. One way to evaluate the potential impact is by looking across the value chain and deciding where the organization wants to create, capture, and own value. For companies making large AI investments, token economics impacting total cost of ownership, operating model decisions, and infrastructure strategies can become central strategic issues. For CFOs, consider where AI can create the greatest strategic and economic impact. For example, when using AI agents in the tech function, one 15-person startup spent $2,000 in AI tokens to build three workflow tools with approximately 300,000 lines of code. The company concluded that the output was worth the investment and decided to adopt the AI coding platform more broadly. Cloud tech company Vercel's CEO Guillermo Rauch said that its biggest token spenders are also its most productive employees, and that a $10,000 day's token spend could likely have saved millions for the company given enhanced IT delivery.12 The cost-to-benefit dynamics need to make sense.

That starts with understanding what the organization is funding, what is driving usage, and what the underlying economics look like. Many companies today are still signing checks without a clear view of cost, consumption, or returns. CFOs should consider changing that. AI may require a dedicated governance function, one that also includes strong finops capabilities, cost discipline, and financial sustainability oversight. Any AI tool use should also introduce new key performance indicators that measure value across data enablement, productivity, revenue growth, new business models, and talent outcomes. To understand its costs, Kumo AI, a 60-person startup, tracks token usage per engineer and considers the investment a strategic R&D expense.13

As Salami of TIAA Nuveen said, “From the front desk to the boardroom, every investment should empower organizations to anticipate market shifts, differentiate, and deliver meaningful outcomes for clients and shareholders.”14

AI programs need to be given an equally hard look to determine which programs are strategic and which may not be. If AI scaling is deemed strategic, these token dynamics will need to be CFO-managed and factored into forecasts, margins, capital risk projections, and much more. This requires a tight partnership with business and tech leaders to validate the strategic expense and understand total cost of ownership as an investment input to ROI. It also requires managing costs over time, whether that means making a strategic investment in private hosting or exploring new financial management approaches like chargebacks.
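As one illustration of the chargeback approach mentioned above, a shared AI bill can be allocated to business units in proportion to their metered token usage. The sketch below assumes usage-proportional allocation, which is only one of several possible policies. The unit names, bill amount, and usage figures are hypothetical.

```python
def allocate_chargeback(total_bill: float, tokens_by_unit: dict[str, int]) -> dict[str, float]:
    """Split a shared AI bill across units in proportion to metered token usage."""
    total_tokens = sum(tokens_by_unit.values())
    if total_tokens == 0:
        # No metered usage: nothing to allocate on a usage basis.
        return {unit: 0.0 for unit in tokens_by_unit}
    return {
        unit: total_bill * tokens / total_tokens
        for unit, tokens in tokens_by_unit.items()
    }


# Hypothetical monthly bill and usage by business unit (illustrative only).
usage = {
    "claims": 6_000_000_000,
    "customer_service": 3_000_000_000,
    "finance": 1_000_000_000,
}
for unit, charge in allocate_chargeback(250_000.0, usage).items():
    print(f"{unit}: ${charge:,.2f}")
```

Proportional allocation is simple to explain, but it treats every token as equally priced. An organization running a mix of models may prefer to allocate on metered cost rather than raw token counts.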

AI is not just a technology investment—it is an economic system. Left unmanaged, it introduces volatility, margin pressure, and capital risk. Governed well, it can become a lever for sustained enterprise value. The CFO who can connect tokens to the P&L through foresight, discipline, and scenario modeling will not only control costs but shape long-term competitive advantage.

Endnotes

1. Diana Kearns-Manolatos, "As cloud costs rise, hybrid solutions are redefining the path to scaling AI", Deloitte, November 5, 2025. 

2. Tim Smith, Gregory Dost, Gharima Dhasmana, Parth Patwari, Diana Kearns-Manolatos, Iram Parveen, "AI is capturing the digital dollar. What’s left for the rest of the tech estate?", Deloitte, October 15, 2025. 

3. CFO of an investment management firm, phone interview with the Deloitte team, January 2026.

4. OpenAI, OpenAI Help Center, "What are tokens and how to count them?", updated January 2026, accessed March 25, 2026.  

5. Nicholas Merizzi, Tim Smith, Nitin Mittal, Gaurav Churiwala, Diana Kearns-Manolatos, "AI tokens: How to navigate AI’s new spend dynamics", Deloitte, January 19, 2026.

6. Alistair Barr, "Is the tech industry ready for AI 'super agents'?", Business Insider, April 2, 2025, accessed March 27, 2026. 

7. CFO of an investment management firm, phone interview with the Deloitte team, February 2026.

8. Sundar Pichai, "Google I/O 2025: From research to reality," Google, Sundar Pichai’s opening keynote as transcribed, May 20, 2025, accessed March 25, 2026. 

9. VentureBeat, “8 billion tokens a day forced AT&T to rethink AI orchestration,” February 26, 2026, accessed March 25, 2026.

10. Felix Bartler, SAP, "Monitor Token Usage with SAP Generative AI Hub," SAP Community blog post, January 10, 2025, accessed March 25, 2026. 

11. Anjali Shaikh, Steve Pratt and Megan Turchi, "Nuveen CFO on elevating CIO-CFO Strategic Relationships," The Wall Street Journal, CFO Journal, January 30, 2026; WSJ content sponsored by Deloitte, accessed March 25, 2026.    

12. Katherine Bindley, "You’ve Finally Figured Out AI at Work—Now Comes the Bill," The Wall Street Journal, March 17, 2026, accessed March 25, 2026.  

13. Bindley, "You’ve Finally Figured Out AI at Work—Now Comes the Bill."  

14. Shaikh, Pratt and Turchi, "Nuveen CFO on elevating CIO-CFO Strategic Relationships." 
