Geo-Based Incrementality Testing at Scale: Insights from a Leading Bank

By Vishu Agarwal and Jerry Wang

Digital marketers have long relied on people-based A/B tests, in-platform lift studies, and randomized controlled trials (RCTs) to gauge the incremental impact of media. But each approach comes with its own practical limits.

Limitations of traditional testing methods

In response, geo-based testing frameworks were developed. They use markets/geos, not individuals, as experimental units. These approaches sidestep user-level tracking, exploit natural sales variability, and allow marketers to run heavy-up or hold-out tests across both digital and offline channels.

Traditional geo-based causal designs assume that each market can be treated as a self-contained system, with minimal overlap in media reach, audience movement, or consumer interactions between test and control regions. In the U.S., this assumption largely holds true within Designated Market Areas (DMAs), where media delivery and audience behavior are geographically aligned. But in countries like Canada, where daily commuting across cities blurs those lines, that assumption breaks down.

Case Study

Our latest project with a leading Canadian bank pushed the boundaries of incrementality testing. We set out to run statistically rigorous geo-based tests in a country where clear DMA boundaries don’t exist and where commuting patterns blur the lines between cities.

Toronto’s surrounding region exemplifies the challenge. Cities like Mississauga, Brampton, and Vaughan share overlapping economic and behavioral ecosystems. Many residents live in one city, work in another, and shop in both. In such environments, traditional geo-based randomization risks spillover, where treatment effects “leak” into control areas, thus undermining causal clarity.

The central innovation of this work was to adapt causal inference design to this non-standard geography. We excluded Toronto itself, since its high commuter inflow and outflow could cause spillover and distort test-control isolation. Instead, we grouped adjacent suburbs such as Mississauga, Brampton and Vaughan, Waterloo, and Kitchener and Cambridge into coherent market clusters based on commuting intensity. This allowed us to keep markets relatively independent of one another while preserving statistical power.
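As a rough illustration of clustering by commuting intensity, here is a minimal Python sketch. It is not the method used in the engagement: the function name cluster_by_commuting, the 10% threshold, and every commuting share in the example are assumptions invented for illustration, and the resulting clusters are not the ones used in the study. The sketch drops excluded cities (here, Toronto) and then merges any two remaining cities whose commuting share meets the threshold.

```python
from collections import defaultdict


def cluster_by_commuting(cities, commute_share, threshold=0.10, excluded=()):
    """Group cities into clusters by merging every pair whose commuting
    share is at or above `threshold` (single linkage via union-find).

    All parameter names and the default threshold are hypothetical.
    """
    kept = [c for c in cities if c not in excluded]
    parent = {c: c for c in kept}

    def find(c):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), share in commute_share.items():
        # Pairs involving an excluded city are ignored entirely.
        if a in parent and b in parent and share >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for c in kept:
        groups[find(c)].append(c)
    return sorted(sorted(g) for g in groups.values())


# Invented commuting shares, for illustration only.
cities = ["Toronto", "Mississauga", "Brampton", "Vaughan",
          "Waterloo", "Kitchener", "Cambridge"]
commute_share = {
    ("Toronto", "Mississauga"): 0.30,
    ("Mississauga", "Brampton"): 0.22,
    ("Brampton", "Vaughan"): 0.15,
    ("Kitchener", "Waterloo"): 0.35,
    ("Kitchener", "Cambridge"): 0.18,
    ("Vaughan", "Waterloo"): 0.01,
}

clusters = cluster_by_commuting(
    cities, commute_share, threshold=0.10, excluded={"Toronto"}
)
print(clusters)
```

Single linkage is the simplest choice here. A real design would likely also cap cluster size and check that weakly linked chains of cities do not get merged into one oversized market.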

Methodology

To operationalize this approach, we deployed Media Optimization, BCG’s proprietary platform for running geo-based matched-market experiments. It supports three key steps:

Matched market approach leverages highly correlated groupings of markets to enable incrementality measurement

At the heart of the methodology is matched-market creation, where we performed city groupings to run incrementality tests for the bank. The tests enabled us to determine whether the causal design would hold. The overarching goal was to ensure that the test and control groups are nearly identical across all relevant dimensions (business performance, demographics, and geographic characteristics) while also being representative of the national landscape.

This careful design serves two critical purposes:

  1. Ensuring test integrity: High parity between test and control markets allows for clean experimentation and accurate causal measurement.
  2. Enabling scalability: When both clusters mirror the national profile, insights from the test can be confidently generalized to a broader rollout.

We use a greedy algorithm to construct matched markets, iteratively selecting candidate geographies that maximize overall similarity to the control cluster. Each test market must meet a stringent set of guardrails to ensure comparability:
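To illustrate the greedy construction described above, the following is a minimal Python sketch, not the platform's actual implementation. It assumes that similarity is measured as the Pearson correlation between the aggregated pre-period series of the test cluster and that of the control cluster, and it uses a single hypothetical guardrail (min_corr). The function names, market names, and all figures are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def greedy_match(control_series, candidates, max_markets, min_corr=0.9):
    """Greedily add the candidate market that most improves the correlation
    between the summed test series and the control series.

    Stops when no candidate improves the score or `max_markets` is reached.
    `min_corr` is a hypothetical guardrail, not one taken from the source.
    """
    selected = []
    total = [0.0] * len(control_series)
    best_corr = float("-inf")
    remaining = dict(candidates)

    while remaining and len(selected) < max_markets:
        scored = []
        for name, series in remaining.items():
            trial = [t + s for t, s in zip(total, series)]
            scored.append((pearson(trial, control_series), name))
        corr, name = max(scored)
        if corr <= best_corr:
            break  # adding any further market would not improve similarity
        best_corr = corr
        total = [t + s for t, s in zip(total, remaining.pop(name))]
        selected.append(name)

    return selected, best_corr, best_corr >= min_corr


# Invented weekly pre-period figures, for illustration only.
control = [100, 120, 90, 130, 110, 140]
candidates = {
    "market_a": [30, 40, 20, 45, 30, 50],
    "market_b": [20, 20, 25, 20, 25, 20],
    "market_c": [10, 5, 12, 4, 9, 3],
}

selected, corr, passes = greedy_match(control, candidates, max_markets=3)
print(selected, round(corr, 4), passes)
```

In this toy data, market_a is chosen first because it tracks the control series best on its own. market_b is added next because the combined series then mirrors the control, and market_c is rejected because it would lower the correlation. A greedy search is fast but not guaranteed to find the globally best cluster.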

Custom Guardrails for the leading Canadian bank

Because this engagement focused on the bank’s Insurance Line of Business (LOB), we introduced additional validations tailored to that context:

Validating Experimental Integrity

Finally, we verified that performance metrics were statistically indistinguishable between clusters prior to test launch:

This rigorous pre-test validation ensured that any observed differences during the test could be confidently attributed to media treatment, not pre-existing variance.
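One way to check that two clusters are statistically indistinguishable before launch is an exact paired sign-flip (permutation) test on the pre-period differences. The Python sketch below is an assumption about how such a check could look, not the validation actually used in the engagement. The function name, the weekly figures, and the 0.05 threshold are all hypothetical.

```python
from itertools import product


def sign_flip_p_value(test, control):
    """Exact two-sided paired permutation test.

    Under the null hypothesis that the clusters behave alike, each weekly
    difference is equally likely to be positive or negative, so we
    enumerate every sign assignment and count how often the absolute
    summed difference is at least as large as the observed one.
    Enumeration costs 2**n, so this suits short pre-periods only.
    """
    diffs = [t - c for t, c in zip(test, control)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float inputs
            extreme += 1
    return extreme / total


# Invented weekly pre-period KPI values, for illustration only.
control_pre = [100, 99, 104, 101, 100, 98, 102, 101]
test_pre = [102, 98, 105, 99, 101, 97, 103, 102]

p_value = sign_flip_p_value(test_pre, control_pre)
ALPHA = 0.05  # hypothetical significance threshold
clusters_comparable = p_value > ALPHA
print(p_value, clusters_comparable)
```

A large p-value here means only that the pre-period data give no evidence of a difference between clusters. It does not prove equivalence, so in practice such a test would sit alongside checks on correlation and absolute gaps.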

The Take‑Away

Within one quarter, the bank ran multiple concurrent geo-based experiments, a process that had previously taken a year or more. Each experiment produced clean, incremental lift estimates, allowing marketing and analytics teams to identify winning strategies, scale them, and move to the next test with confidence. The result was not just faster optimization but institutionalized causal learning. The client evolved from sequential pilots to a high-velocity, test-and-learn culture grounded in scientific inference rather than intuition.