Digital marketers have long relied on people‑based A/B tests, in‑platform lift studies, and randomized‑controlled trials (RCTs) to gauge the incremental impact of media. But each approach comes with its own practical limits.
In response, geo-based testing frameworks were developed. They use markets/geos, not individuals, as experimental units. These approaches sidestep user-level tracking, exploit natural sales variability, and allow marketers to run heavy-up or hold-out tests across both digital and offline channels.
Traditional geo-based causal designs assume that each market can be treated as a self-contained system, with minimal overlap in media reach, audience movement, or consumer interactions between test and control regions. In the U.S., this assumption largely holds true within Designated Market Areas (DMAs), where media delivery and audience behavior are geographically aligned. But in countries like Canada, where daily commuting across cities blurs those lines, that assumption breaks down.
Case Study
Our latest project with a leading Canadian bank pushed the boundaries of incrementality testing. We set out to run statistically rigorous geo-based tests in a country where clear DMA boundaries don’t exist and where commuting patterns blur the lines between cities.
Toronto’s surrounding region exemplifies the challenge. Cities like Mississauga, Brampton, and Vaughan share overlapping economic and behavioral ecosystems. Many residents live in one city, work in another, and shop in both. In such environments, traditional geo-based randomization risks spillover, where treatment effects “leak” into control areas, thus undermining causal clarity.
The central innovation of this work was to adapt causal inference design to this non-standard geography. We excluded Toronto itself to prevent spillover, since its high commuter inflow and outflow could distort test-control isolation. Instead, we grouped adjacent suburbs such as Mississauga, Brampton and Vaughan, Waterloo, and Kitchener and Cambridge into coherent market clusters based on commuting intensity. This allowed us to keep markets relatively independent of one another while preserving statistical power.
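One way to picture grouping by commuting intensity is as a graph problem: cities are nodes, and two cities are merged whenever the share of residents commuting between them crosses a threshold. The sketch below is illustrative only. The function name, the threshold, and every flow value are hypothetical, and the resulting groups are not the clusters used in the engagement.

```python
from collections import defaultdict


def cluster_by_commuting(flows, threshold, excluded=frozenset()):
    """Merge cities linked by a commuting share >= threshold (union-find).

    flows: {(origin, destination): share of origin residents commuting there}
    excluded: cities dropped entirely (for example, a dominant hub).
    """
    parent = {}

    def find(city):
        parent.setdefault(city, city)
        while parent[city] != city:
            parent[city] = parent[parent[city]]  # path compression
            city = parent[city]
        return city

    for (origin, dest), share in flows.items():
        if origin in excluded or dest in excluded:
            continue
        root_o, root_d = find(origin), find(dest)
        if share >= threshold:
            parent[root_o] = root_d  # strong commuting link: same cluster

    clusters = defaultdict(set)
    for city in list(parent):
        clusters[find(city)].add(city)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical commuting shares, for illustration only.
example_flows = {
    ("Mississauga", "Toronto"): 0.30,
    ("Mississauga", "Brampton"): 0.15,
    ("Brampton", "Vaughan"): 0.12,
    ("Kitchener", "Waterloo"): 0.25,
    ("Kitchener", "Cambridge"): 0.14,
    ("Waterloo", "Mississauga"): 0.02,
}
example_clusters = cluster_by_commuting(
    example_flows, threshold=0.10, excluded={"Toronto"}
)
```

With these made-up numbers, weak links (such as the 2% flow) leave clusters separate, while the excluded hub never pulls its neighbors into a single group.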
Methodology
To operationalize this approach, we deployed Media Optimization, BCG’s proprietary platform for running geo-based matched-market experiments. It supports three key steps:
- Creating matched markets
- Running power analysis to determine test budget and duration
- Launching the test and measuring incremental impact
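The article does not describe how the platform performs its power analysis. As a rough illustration of the idea, the sketch below uses a standard normal approximation to estimate the minimum detectable effect (MDE) for a given test duration from the historical week-to-week difference between test and control clusters. The function name, the assumption of independent weeks, and the sample data are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, stdev


def minimum_detectable_effect(weekly_diffs, n_weeks, alpha=0.05, power=0.80):
    """Approximate MDE on the average weekly test-minus-control difference.

    Assumes weekly differences are independent and roughly normal, and that
    pre-test noise is a fair guide to noise during the test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = z.inv_cdf(power)
    standard_error = stdev(weekly_diffs) / sqrt(n_weeks)
    return (z_alpha + z_power) * standard_error


# Hypothetical pre-test differences in the primary KPI (test minus control).
historical_diffs = [12.0, -8.0, 5.0, -3.0, 9.0, -11.0, 4.0, -6.0]
mde_4_weeks = minimum_detectable_effect(historical_diffs, n_weeks=4)
mde_8_weeks = minimum_detectable_effect(historical_diffs, n_weeks=8)
```

Under these assumptions, a longer test shrinks the MDE by the square root of the duration ratio, which is the basic trade-off between test length and the media budget needed to produce a detectable lift.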
At the heart of the methodology is matched-market creation: we grouped cities into clusters and used them to run incrementality tests for the bank. The tests enabled us to determine whether the causal design would hold. The overarching goal was to ensure that the test and control groups were nearly identical across all relevant dimensions (business performance, demographics, and geographic characteristics) while also being representative of the national landscape.
This careful design serves two critical purposes:
- Ensuring test integrity: High parity between test and control markets allows for clean experimentation and accurate causal measurement.
- Enabling scalability: When both clusters mirror the national profile, insights from the test can be confidently generalized to a broader rollout.
We use a greedy algorithm to construct matched markets, iteratively selecting candidate geographies that maximize overall similarity to the control cluster. Each test market must meet a stringent set of guardrails to ensure comparability:
- Primary KPI alignment: Pearson correlation of ≥95% between control and test clusters on the primary KPI (typically sales)
- National sales share parity: Share of national sales within ±5 basis points (bps) across clusters
- Secondary KPI consistency: Tight correlation on additional signals such as store traffic or online sessions
- Product mix parity: Distribution across product categories within ±5 bps
- Channel mix parity: Comparable distribution across sales channels (e.g., online vs. retail)
- Demographic alignment: Alignment on key demographic variables relevant to the business
- Geographic dispersion: Geographic spread of clusters sufficient to mitigate regional biases
Custom Guardrails for the leading Canadian bank
Because this engagement focused on the bank’s Insurance Line of Business (LOB), we introduced additional validations tailored to that context:
- Demographic parity: Markets were balanced on insurance-relevant variables such as household income, marital status, commute time, and gender split.
- Channel mix: The client operated across online and brick-and-mortar (B&M) channels. We ensured that the share of digital vs. in-person sales remained consistent between test and control groups.
- Branch capacity: Since most sales were driven through physical branches, we validated that staffing and vacancy rates per branch were comparable across clusters.
- Product mix: The bank offered both auto and home insurance, so we checked that each market had a similar product-type split.
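A mix check such as the product or channel split can be expressed as the largest gap in category shares between two clusters, reported in basis points. The sketch below is a minimal illustration. The function names and the counts are hypothetical, and the tolerance to compare against (for instance the ±5 bps used in the general guardrails) is left to the caller.

```python
def mix_shares(counts):
    """Convert per-category counts into shares that sum to 1."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def max_mix_gap_bps(test_counts, control_counts):
    """Largest absolute difference in category share, in basis points."""
    test_mix = mix_shares(test_counts)
    control_mix = mix_shares(control_counts)
    categories = set(test_mix) | set(control_mix)
    return max(
        abs(test_mix.get(c, 0.0) - control_mix.get(c, 0.0)) for c in categories
    ) * 10_000


# Hypothetical policy counts by product type, for illustration only.
example_gap = max_mix_gap_bps(
    {"auto": 600, "home": 400},
    {"auto": 1203, "home": 797},
)
```

The same function applies unchanged to the digital versus in-person channel split.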
Validating Experimental Integrity
Finally, we verified that performance metrics were statistically indistinguishable between clusters prior to test launch:
- Primary KPI: A composite metric defined as sales × customer lifetime value (LTV), showing ≥ 95% correlation between test and control groups.
- Secondary KPIs: Early funnel indicators, such as application starts and completions, confirmed consistent conversion patterns.
This rigorous pre-test validation ensured that any observed differences during the test could be confidently attributed to media treatment, not pre-existing variance.
The Take‑Away
Within one quarter, the bank ran multiple concurrent geo-based experiments, a process that had previously taken a year or more. Each experiment produced clean, incremental lift estimates, allowing marketing and analytics teams to identify winning strategies, scale them, and move to the next test with confidence. The result was not just faster optimization: It was institutionalized causal learning. The client evolved from sequential pilots to a high-velocity, test-and-learn culture grounded in scientific inference rather than intuition.