Artificial intelligence is shaking up business as usual, enabling corporate functions from marketing and sales to finance to deliver predictive insights at extraordinary speed and scale. Given the possibilities, it’s little wonder that adoption of AI across the enterprise has attracted enormous interest and investment.
But like the cobbler who resoles everyone else’s shoes before fixing his own, IT organizations have been slower to employ machine learning (ML) technologies within their own functions. That may soon change. In the past few years, a number of vendors have begun designing and developing powerful analytics tools to address the particular challenges that IT personnel face in managing, updating, and running IT hardware and software across the enterprise.
The market for AI in IT operations (or AIOps, as it is known) is new, but its potential is huge. A BCG survey of 112 CIOs across multiple industries found that these technologies could significantly improve the cost-effectiveness and performance of IT operations, allowing organizations to drive innovation more quickly without sacrificing security, stability, or service.
It may be too early for AIOps to fully replace manual IT operations activities, but that should not lull CIOs and IT directors into thinking that they can sit back and wait. While observers and analysts may spill ink over whether this emerging technology is worthy of the hype, leading IT organizations have already resolved that question and are moving decisively down the path toward implementing AIOps tools and developing the capabilities to use them. The question isn’t whether AIOps will revolutionize IT operations; it’s how long the rest of the field will take to recognize that the shift is already underway.
IT operations can feel like a thankless job. Teams may have the knowledge and experience to be a strategic partner to the business, but demonstrating value can be a Sisyphean task. The daily grind of chasing down alerts and patching problems can lock IT personnel into a cycle in which they are continually playing catch-up instead of preventing problems from arising. Increasing use of the cloud can alleviate some of these issues, but it doesn’t make the operational complexity go away: someone still has to manage those cloud services and organizational interconnections. With workloads rapidly growing and with no consistent, effective way to prioritize activities, IT operations teams are constantly on the back foot, perpetuating the stereotype that the function is reactive and slow to move, which is the very perception that IT functions have tried so hard to shake.
IT operations must deal with a number of key challenges:
These demands are growing at a time when IT operations budgets are under increasing pressure.[1] The only way for IT leaders to deliver the stability and cost-effectiveness that their budgets demand is to make their operations more predictive, proactive, and automated.
Note 1: Gartner’s IT Key Metrics 2019 report indicates that, on average, data centers, service desks, and voice and data networks commanded 35% of the total IT budget in 2018, down from 46% in 2014.
AIOps is a new field, and a confusing one. Only in 2017 did Google begin receiving a significant volume of search requests for “AIOps,” and most vendors are still refining their solution set. For clarity, we define AIOps as comprising all solutions that use big data, AI, and ML to enhance and automate IT operations and monitoring.[2] (See Exhibit 1.)
Note 2: By BCG’s definition, AIOps includes all artificial intelligence (AI) and machine learning (ML) in application performance monitoring (APM), IT infrastructure monitoring (ITIM), network performance monitoring and diagnostics (NPMD), and IT event correlation and analysis (ITEC&A). Our definition excludes AI and ML in areas such as experience management operations, cybersecurity operations, and delivery automation.
Within the IT operations and monitoring space, AIOps is most suitable for application performance monitoring (APM), IT infrastructure monitoring (ITIM), network performance monitoring and diagnostics (NPMD), and IT event correlation and analysis (ITEC&A), where it can help automate routine manual operations activities. That automation potential has prompted a surge in development. The market for core AIOps is projected to grow from $9.4 billion in 2017 to $13.8 billion in 2021, a compound annual growth rate of 10%.[3] AIOps orchestrators (platforms built to orchestrate insight and actions on the basis of log data from various monitoring solutions) are expected to grow by 26% over the same period. (See Exhibit 2.)
Note 3: Areas encompassed by core AIOps include APM, ITIM, NPMD, and ITEC&A.
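As a quick check on the growth arithmetic, the stated compound annual growth rate can be reproduced from the two market-size figures. The sketch below is illustrative only: the function name `cagr` is our own, and it assumes the standard CAGR formula applied over the four years from 2017 to 2021.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Core AIOps market size in billions of US dollars, as cited in the text.
core_aiops_cagr = cagr(start_value=9.4, end_value=13.8, years=2021 - 2017)

# Prints the rate as a percentage; it rounds to roughly 10%.
print(f"Core AIOps CAGR, 2017-2021: {core_aiops_cagr:.1%}")
```

The result agrees with the roughly 10% figure quoted for core AIOps.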
As the underlying technologies become more established, AI-enabled tools and platforms will be able to automate core monitoring and management activities at scale, relieving many of the most pressing IT operations challenges quickly, reliably, and consistently.
We believe that AIOps will help transform IT operations in three critical ways:
With sufficient refinement, AIOps may someday be able to automate a significant portion of all IT operations and monitoring activities.
So far, only the most progressive IT organizations have implemented AIOps solutions, but adoption is likely to grow significantly over the next three to five years. More than 40% of CIOs say that they plan to begin using AIOps solutions by the end of 2021, up from about 15% who indicated they were using AIOps as of January 2019, when the survey was conducted. Other CIOs are watching the space but intend to wait for the tools to mature further.
As with most other emerging technologies, solution development will occur in stages. Use cases that focus on providing visibility and insight are likely to mature first, with more sophisticated execution capabilities added over time.
User adoption will follow the technology maturity curve. (See Exhibit 3.) Of the roughly nine core use cases that the AIOps vendor community is actively building, the ones that have attracted the greatest interest among early adopters focus on pattern recognition and data consolidation. By contrast, few CIOs have indicated that they are ready to embrace truly bleeding-edge use cases, such as ones that rely on machine-generated recommendations to fully automate issue remediation.
In our view, three use cases hold the greatest short-term potential:
In the medium term, several other use cases hold strong promise. These include root cause analysis tools, which are designed to help IT personnel understand the issues triggering different incidents and alerts and allow them to prioritize remediation; incident prediction tools, which can identify causal patterns between IT variables; and automated issue remediation, which has the potential to automatically address routine problems and events—for instance, by adjusting IT configurations or workflows.
Although few vendors have developed a full suite of enterprise-grade AIOps tools as yet, several categories of vendors are staking claims in the burgeoning AIOps market. They include incumbent monitoring players, entrants from adjacent tool markets, emerging AI/ML challengers, and hyperscalers. (See Exhibit 4.)
That crowded playing field is good for innovation. But it also creates more work for CIOs and other potential customers in the meantime, since they need to sift through multiple tools and options to develop a working knowledge of vendor capabilities. Over the next several years, though, we’re likely to see considerable vendor consolidation, which should make selection and management less of a chore. That consolidation will be due to three factors:
AIOps is still in its early stages, but it won’t stay that way for long. It seems likely that, five years from now, all top-performing IT organizations will have adopted some form of AIOps—with third-party tools or with hyperscalers’ natively provided tools—achieving advantages in cost, service, and stability over slower-moving peers.
CIOs have less time to prepare for AIOps than they may think. The tasks of selecting the right vendor and identifying the right use cases are challenging enough; but beyond that, getting ready for AIOps internally can be a heavy lift. AIOps requires structured access to significant amounts of historical and real-time IT operations data, as well as dedicated resources to enable data access and configure the AIOps tools. In addition, successfully automating the resulting insights and actions will demand significant process reengineering to connect the dots end-to-end and ensure that the right operational handoffs are in place.
Organizations that want to capture maximum upside should start by actively sponsoring a cross-functional team, pinpointing one or two high-value use cases, and assembling the right mix of data engineering, data science, and development talent to test and refine the appropriate tools. Anticipating risk, legal, compliance, and other considerations, and securing ongoing engagement from the functions responsible for them, are also crucial, not only to reduce potential business exposure but also to help build confidence and trust among non-IT functions in the quality and efficacy of AI-based risk modeling and detection.
Training and change management are also essential. AIOps is a very different way of working, and it will require organizations to retrain some IT operations teams and redeploy others. Teams must learn to work with AIOps tools through relevant, labeled examples, operationalize them, and understand their output. And CIOs must ensure that the development effort has a sufficient budget and firm leadership commitment, with funding and reviews based on achieving predefined business outcomes and not simply development milestones.
While AIOps may look like a “watch this space” opportunity, CIOs and IT leaders who sit back now risk being unable to capture the full benefit of this technological revolution later, when full-scale implementation goes mainstream.