
As enterprises adapt to the costs of cloud AI, one finding stands out: cloud AI economics are workload-specific, not provider-specific. Moreover, platform selection is not a single procurement decision but three linked choices (model, workload architecture, and control plane), each with different implications for cost, lock-in, and governance.

The leaders who own these cloud AI decisions (CEOs setting strategic direction, CFOs modeling AI economics, and CIOs and CTOs designing the architecture and operating model) must not conflate these choices or optimize solely on headline token prices. Those who do will systematically overspend, under-govern, or both.

Benchmarking for Pricing Insights

According to BCG’s Nimbus Pricing Index, core cloud pricing remains remarkably steady. We expect it to remain so because hyperscalers need consistent cash flow to fund AI-related capital investments. (See Exhibit 1.)

Flat Core Compute Pricing Sustains Cash Flows as Hyperscalers Scale AI Infrastructure

As AI use cases scale from pilots to mainstream workloads, the relevant comparison shifts from cost per technical unit to cost per business outcome. The larger challenge for enterprise buyers is not price volatility; it is price comparability.

Across AI workloads, providers use different billing meters: text is billed by input and output tokens, speech by audio duration, and vision by image or feature unit. Multimodal workflows combine these meters, making token-to-token comparisons structurally incomplete. Even within token-based workloads, effective cost per outcome varies with tokenizer efficiency, prompt construction, context length, and output size. Organizations should therefore normalize spend to business-relevant units (cost per 1,000 summaries, per 10 hours of audio, or per 5,000 image captions) rather than relying on headline token rates. (See Exhibit 2.)
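To make the normalization concrete, the following minimal Python sketch converts headline rates into cost per business-relevant unit. All prices and token counts are hypothetical placeholders, not benchmark results or real provider rates, and the names `TokenPricing`, `cost_per_1000_summaries`, and `cost_per_10_audio_hours` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class TokenPricing:
    """Hypothetical per-million-token rates in USD (not real provider prices)."""
    input_per_million: float
    output_per_million: float


def cost_per_1000_summaries(pricing: TokenPricing,
                            avg_input_tokens: float,
                            avg_output_tokens: float) -> float:
    """Cost of 1,000 summaries, given average tokens consumed per summary."""
    cost_per_summary = (
        avg_input_tokens * pricing.input_per_million
        + avg_output_tokens * pricing.output_per_million
    ) / 1_000_000
    return cost_per_summary * 1_000


def cost_per_10_audio_hours(rate_per_minute: float) -> float:
    """Speech is metered by duration, so the unit conversion is minutes, not tokens."""
    return rate_per_minute * 60 * 10


# Assumption: provider A has the lower headline rate, but its tokenizer and
# prompt construction consume more tokens for the same document.
provider_a = TokenPricing(input_per_million=2.0, output_per_million=8.0)
provider_b = TokenPricing(input_per_million=3.0, output_per_million=10.0)

cost_a = cost_per_1000_summaries(provider_a, avg_input_tokens=6000, avg_output_tokens=500)
cost_b = cost_per_1000_summaries(provider_b, avg_input_tokens=3000, avg_output_tokens=300)

print(f"Provider A: ${cost_a:.2f} per 1,000 summaries")
print(f"Provider B: ${cost_b:.2f} per 1,000 summaries")
```

Under these assumed inputs, the provider with the higher headline token rate is cheaper per 1,000 summaries, which is why the comparison has to be made at the outcome level.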

"Token" Is Not a Universal Cost Unit Across AI Workloads

Studying Price Comparability

To better understand price comparability, we benchmarked three practical AI workloads—NLP summarization, image captioning, and speech-to-text—across AWS, Google Cloud, and Azure using each provider’s managed cloud-native AI services. We used frontier LLMs for summarization, a vision-plus-LLM workflow for image captioning, and managed transcription for speech—applying standardized prompts, common sample sets, and managed services throughout.

We did not set out to declare one cloud service provider (CSP) the lowest-cost option overall: results are sensitive to sample characteristics, output lengths, and implementation details. Exhibit 3 shows that cost leadership shifts by workload even when prompts, sample sets, and output expectations are held constant. That is the core procurement implication: provider-level generalizations create a real risk of overpaying. (See “Methodology Note.”)

Methodology Note: Cloud Services vs. Model Performance
Where a cloud service provider offers a specific foundation model as the default (for example, Gemini on Google Cloud or Claude via Amazon Bedrock), observed cost differences reflect both the platform’s pricing structure and the model’s tokenization efficiency. The fourfold cost gap in NLP summarization between Google Cloud and AWS captures this combined effect. Enterprises seeking to separate model economics from platform economics should conduct additional model-level benchmarks; these findings are most useful for procurement at the cloud platform level.

Cloud AI Comparisons Show Cost Leadership Shifts by Workload

This benchmarking exercise offers several critical cost insights.

Selecting a CSP for Agentic AI

The workload-specific economics of AI forces enterprises to use different criteria to select CSPs than they have in the past. What used to be a procurement decision—negotiated on headline pricing and master-agreement leverage—is now a strategic platform decision that shapes innovation velocity, governance control, and cost economics at scale.

In fact, agentic AI changes what “platform” means. The operating model shifts from individual assistants to teams of collaborating agents, distributing work across routing, retrieval, generation, and review. (See Exhibit 4.) Given that the value contribution from agentic AI is expected to double by 2028, enterprises will need stronger orchestration reliability, governed access to enterprise systems, persistent memory, and production-grade monitoring. (See Exhibit 5.)

AI Will Shift from "Individual Assistants" to a "Team of Collaborating Agents"
Value from Agentic AI Is Expected to Double by 2028

CSP Selection Is Three Decisions—Not One

Enterprise AI platform decisions span three linked but distinct choices. The first is model choice: which foundation models or managed model experiences to use for priority tasks. The second is workload architecture: whether to use modular pipelines, such as separate vision and text components, or integrated multimodal approaches. The third is the enterprise platform or control-plane choice: where to accept lock-in on orchestration, memory, observability, guardrails, and connectors.

One of the most common and costly mistakes in enterprise AI procurement is conflating these three decisions: treating a model pricing difference as a platform verdict, or a platform lock-in decision as a model swap.

Model Choice

Model choice determines the core economics of inference: tokenizer efficiency, context window, output length, latency, and quality for a specific task. Enterprises should not select a model garden by brand alone; they should map priority workloads to the smallest model capable of meeting quality thresholds, then monitor the agentic cost variables that drive production spend. Hyperscalers, for their part, are finding ways to differentiate their core AI model offerings. (See Exhibit 6.)

Hyperscalers Differentiate Their Agentic AI Stacks with Cost and Control Trade-offs
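The rule of mapping each priority workload to the smallest model that clears its quality threshold can be sketched as a simple filter. The candidate names, size ranks, quality scores, and costs below are hypothetical; in practice the quality scores would come from an enterprise's own task-level evaluation.

```python
from typing import Optional

# Hypothetical candidates: every value here is illustrative only.
CANDIDATES = [
    {"name": "small", "size_rank": 1, "quality": 0.78, "cost_per_1k_outcomes": 4.0},
    {"name": "medium", "size_rank": 2, "quality": 0.86, "cost_per_1k_outcomes": 11.0},
    {"name": "large", "size_rank": 3, "quality": 0.93, "cost_per_1k_outcomes": 30.0},
]


def smallest_adequate_model(candidates: list, quality_threshold: float) -> Optional[dict]:
    """Return the smallest model whose measured quality meets the threshold.

    Returns None when no candidate qualifies, which signals that the
    workload needs a different approach rather than a bigger bill.
    """
    adequate = [m for m in candidates if m["quality"] >= quality_threshold]
    if not adequate:
        return None
    return min(adequate, key=lambda m: m["size_rank"])


# Different workloads carry different thresholds, so they map to different models.
summarization_pick = smallest_adequate_model(CANDIDATES, quality_threshold=0.85)
triage_pick = smallest_adequate_model(CANDIDATES, quality_threshold=0.75)

print(summarization_pick["name"], triage_pick["name"])
```

Setting the threshold per workload, rather than once for the enterprise, is what keeps low-stakes tasks from defaulting to the most expensive model.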

Workload Architecture

An enterprise agentic AI capability is a layered stack, not a single product purchase. A modular architecture clarifies what must be in place while minimizing lock-in. The platform view is organized into six layers:

The practical question is: Which platform supports the full stack with the least custom build while preserving strategic flexibility? To address this question, leaders need to align on the strategy and then consider the four platform options for enterprise AI, which vary by cost, complexity, lock-in, and governance. (See Exhibit 7.)

Four Platform Options for Enterprise AI

Each additional CSP adds roughly 30% in operational overhead across identity and access management (IAM), networking, security, skills, and governance, so there must be a material net benefit at the workload and platform levels to justify adding one.
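The break-even logic behind that overhead figure can be expressed as a short calculation. This sketch assumes the roughly 30% overhead applies to a baseline annual operating cost; the base is an assumption made here for illustration, and the dollar figures are hypothetical.

```python
def break_even_savings(baseline_annual_ops_cost: float,
                       overhead_rate: float = 0.30) -> float:
    """Annual savings a new CSP must deliver just to offset its added overhead.

    Assumption: overhead is modeled as a flat share of the baseline annual
    operating cost.
    """
    return baseline_annual_ops_cost * overhead_rate


def net_benefit_of_adding_csp(baseline_annual_ops_cost: float,
                              expected_annual_savings: float,
                              overhead_rate: float = 0.30) -> float:
    """Positive means the additional CSP pays for itself; negative means it does not."""
    return expected_annual_savings - break_even_savings(
        baseline_annual_ops_cost, overhead_rate
    )


# Hypothetical case: $1.0M baseline, $250K expected workload-level savings.
threshold = break_even_savings(1_000_000)
net = net_benefit_of_adding_csp(1_000_000, expected_annual_savings=250_000)

print(f"Break-even savings: ${threshold:,.0f}; net benefit: ${net:,.0f}")
```

In this hypothetical case the workload-level savings fall short of the overhead, so adding the provider would not be justified on cost alone.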

Enterprise Platform or Control-Plane Choice

Cost and risk fragment quickly when every business unit builds these controls independently. Companies must centrally house four control-plane elements to create cross-platform visibility, governance, and reuse:

Where an organization centralizes these components determines its controls posture, cost model, and operating model readiness. (See Exhibit 8.)

Where the Key Components for Centralization Must Be Housed to Maximize Strategic and Security Advantages

Applying the Three Decisions: What Leaders Should Test

The decisions concerning model choice, workload architecture, and the enterprise platform or control plane help leadership evaluate each possible platform against a set of diligence tests. These tests are not a separate framework; they connect workload economics to platform readiness and help CXOs make the trade-offs explicit.

Modeling the Cost-of-Ownership Decision

Agentic AI introduces a cost profile that behaves differently from traditional cloud workloads—and even from single-turn generative AI. The CFO-ready view must separate one-time stand-up costs from run costs, and explicitly model the variables that drive scaling economics. (See Exhibit 9.)

The cost framework shows that executives should track and manage “agentic cost variables”—the measurable drivers of economic performance and predictability in production deployments.

Agentic AI Cost Variables - What to Measure, Model, and Monitor
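A minimal sketch of a cost-of-ownership model that keeps one-time stand-up costs separate from run costs might look like the following. The specific variables (steps per task, tokens per step, retry rate) are assumptions chosen to illustrate how multi-step agent workflows scale; they are not the article's list of agentic cost variables, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgenticWorkload:
    """Hypothetical run-cost drivers for one agentic workload."""
    tasks_per_month: int
    steps_per_task: float          # model calls per task (e.g., routing, retrieval, generation, review)
    tokens_per_step: int           # average input plus output tokens per call
    retry_rate: float              # fraction of steps that are repeated
    price_per_million_tokens: float


def monthly_run_cost(w: AgenticWorkload) -> float:
    """Run cost scales with tasks x steps x tokens, inflated by retries."""
    effective_steps = w.steps_per_task * (1 + w.retry_rate)
    tokens = w.tasks_per_month * effective_steps * w.tokens_per_step
    return tokens / 1_000_000 * w.price_per_million_tokens


def cumulative_cost(stand_up_cost: float, w: AgenticWorkload, months: int) -> float:
    """One-time stand-up cost plus run cost accumulated over the period."""
    return stand_up_cost + months * monthly_run_cost(w)


workload = AgenticWorkload(
    tasks_per_month=100_000,
    steps_per_task=4,
    tokens_per_step=2_000,
    retry_rate=0.25,
    price_per_million_tokens=5.0,
)

run = monthly_run_cost(workload)
first_year = cumulative_cost(stand_up_cost=200_000, w=workload, months=12)

print(f"Monthly run cost: ${run:,.0f}; first-year total: ${first_year:,.0f}")
```

Because steps per task and retry rate multiply the token volume, small changes in agent behavior can move run costs more than a change in the headline token price.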

A Phased Path to Value—Scaling Without Losing Control

A phased approach builds enterprise architecture while enabling business unit adoption. (See Exhibit 10.) This avoids accelerating into production before governance and repeatability are in place:

Platform decisions are easiest to reverse early—and hardest to reverse once skills, operating models, and governance are embedded.

Phased Approach Provides Runway for Development While Supporting Business Unit Adoption

Given the pace of agentic AI adoption, CSP selection needs to shift from a vendor debate to a governed enterprise strategy. The economic case for AI is compelling. The right decision starts with workload-level evidence, then makes deliberate choices about model, workload architecture, and control plane. Enterprises that centralize governance and track cost per outcome will move faster without losing financial discipline; those that optimize only on headline token prices risk spending too much and governing too little.