As companies deploy AI agents with growing autonomy, these systems will soon interact directly with customers and control critical business processes such as adjusting production schedules and engaging with suppliers. Such capabilities transform the impact AI can deliver, but they also create new risks. Organizations must move quickly to implement new governance approaches, technical capabilities, and control-by-design practices to manage the accountability, control, and trustworthiness of AI agents.

A recent paper by researchers at Stanford and Carnegie Mellon universities highlighted the risks. (See Zora Zhiruo Wang et al., “How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations,” arXiv, November 6, 2025.) An AI agent was tasked with creating an Excel file from expense receipts but was unable to process the data. To achieve its goal, it fabricated plausible records, complete with invented restaurant names. At scale, false records like this would bring penalties for false accounting, or worse.

This example highlights the central challenge of AI agents: They elevate the governance and control challenges of AI to a new level, for three reasons:

The challenge is heightened when organizations that have deployed successful, limited-scope agents take what appears to be a natural next step and give those agents greater autonomy or new capabilities. These upgrades, which may not trigger a comprehensive review, can have dramatic effects.

Organizations therefore need new thinking, in which AI governance includes AI risk management by design, new technical approaches for evaluation, monitoring, and assurance, and robust response plans.

Organizations are pushing ahead with AI adoption: in a global MIT Sloan Management Review/Boston Consulting Group study released in November, just 10% of organizations indicated they had handed decision-making powers to AI, but respondents indicated that within three years this figure should rise to 35%.

Meanwhile, incidents involving AI increased 21% from 2024 to 2025, according to the AI Incidents Database. As AI deployment continues, the need for risk management is rising in parallel.

The immediate cost of insufficient governance for AI agents is painful and obvious: direct financial loss, damage to customer trust, and even legal or regulatory action. But the long-term cost may be even greater. Without strong governance, companies will lack the confidence to deploy AI agents at scale, thereby missing out on the substantial benefits this remarkable technology can deliver.

The AI Agent Difference

Executives are beginning to understand that AI agents require a new governance approach. In the MIT Sloan Management Review/Boston Consulting Group executive survey, 69% agreed that “holding agentic AI accountable for its decisions and actions requires new management approaches.”

To build that new approach, however, it is essential to understand how AI agents differ from the co-pilot AI that many organizations have deployed to date. The key characteristics of an AI agent are that it observes its environment and then, based on that observation, autonomously makes a plan to achieve its defined goal. It then autonomously executes that plan, using tools, APIs, or other systems to influence its environment. Finally, the agent repeats this process in a learning loop until it determines that its goal has been achieved.
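As a minimal sketch of that loop, assuming hypothetical observe, plan, and act functions in place of the model calls and tool integrations a real agent would use, the pattern looks something like this:

```python
# Minimal agent-loop sketch (illustrative only; observe/plan/act are
# hypothetical stand-ins for real model calls and tool integrations).

def observe(environment: dict) -> dict:
    """Capture the part of the environment the agent can see."""
    return {"open_tasks": environment.get("open_tasks", [])}

def plan(state: dict, goal: str) -> list[str]:
    """Autonomously derive a sequence of actions intended to reach the goal."""
    return [f"process:{task}" for task in state["open_tasks"]]

def act(action: str, environment: dict) -> None:
    """Execute one action via tools, APIs, or other systems."""
    task = action.split(":", 1)[1]
    environment["open_tasks"].remove(task)
    environment.setdefault("completed", []).append(task)

def goal_achieved(state: dict) -> bool:
    return not state["open_tasks"]

def run_agent(environment: dict, goal: str, max_iterations: int = 10) -> None:
    """Observe -> plan -> act, repeated until the agent judges the goal met."""
    for _ in range(max_iterations):      # bounded loop as a basic guardrail
        state = observe(environment)
        if goal_achieved(state):
            break
        for action in plan(state, goal):
            act(action, environment)

run_agent({"open_tasks": ["receipt_001", "receipt_002"]}, goal="file all expenses")
```

The bounded loop is itself a simple guardrail: without it, an agent that never judges its goal achieved will keep acting indefinitely.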

In contrast, much of the AI at work in organizations today operates as a co-pilot, responding to human prompts and guidance. In addition, today’s AI typically has a human in the loop who not only checks final decisions, but also shapes how the AI learns, plans, and optimizes, providing guardrails along the way.

Each of the following properties of an AI agent brings risk:

Collectively, these risks represent a step change in exposure. Agents that optimize their own goals locally may create instability across the system. Flawed behavior by one agent may spread. And independent agents may align on a single, harmful strategy, for instance, if they all rely on the same anomalous data source or exploit the same gap in their guardrails. Unlike in traditional systems guided by human workers, cascading failures can emerge quickly and spread rapidly. In summary, vulnerability is moving from a contained, product-level risk to an ecosystem-level one.

The New Vulnerabilities in AI Agents

While CEOs and CFOs need the high-level risk appreciation outlined in the main article, CIOs and CISOs need an extra level of understanding of the specific cyber vulnerabilities of AI agents. Each of the categories below enables malicious actors to exploit one of the four components of the agent loop described above.

Unfortunately, these new threats come with a whole new vocabulary.

State Representation Risks. These are the risks that come from AI agents having “memory” and include:

Reasoning and Decision Risks. These exploit the agent’s greater decision-making capabilities and are typically more direct attempts to control the agent or its ecosystem. They include:

Action and Influence Risks. Here, malicious actors aim to exploit the connection between the AI agent and the environment it inhabits. Attacks include:

Iterative Loop Risks. Here, attackers capitalize on a key capability of AI agents, their ability to evolve, iterate, and cooperate, and turn it to malevolent ends. Attacks include:

An Improved Approach to AI Governance

The first line of defense is to ask: do we need an AI agent? In some cases, 95% of the benefits of an AI agent can be won through other AI technologies where governance is more straightforward and the risks can be more easily managed.

However, there are many applications in which AI agents generate very clear, perhaps transformative, benefits. To mitigate the step change in risk, organizations need a step change in preparation. Yes, many organizations have a shiny new AI risk management program crafted just a few years (or months) ago. This can provide a firm foundation for managing the risks of AI agents, but additional work needs to be done. There are four key elements. (See the exhibit.)

The Components of a Risk Framework for AI Agents

In more detail, they are:

Build a comprehensive risk taxonomy. The first step in reducing risk is to understand it. So it is vital to categorize and prioritize agent-specific risks on a grid across technical, ethical, and operational dimensions.

Just as AI agents are integrated with the rest of the organization, this taxonomy must be integrated with existing enterprise risk frameworks to guide monitoring and mitigation.
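A sketch of what such a taxonomy might look like in code follows; the dimensions, scoring scale, and enterprise categories are illustrative assumptions rather than a standard schema:

```python
# Illustrative risk-taxonomy entry; dimensions, scales, and the enterprise
# mapping are assumptions for the sketch, not a standard.
from dataclasses import dataclass
from enum import Enum

class Dimension(Enum):
    TECHNICAL = "technical"
    ETHICAL = "ethical"
    OPERATIONAL = "operational"

@dataclass
class AgentRisk:
    name: str
    dimension: Dimension
    likelihood: int            # 1 (rare) to 5 (frequent)
    impact: int                # 1 (minor) to 5 (severe)
    enterprise_category: str   # link to the existing enterprise risk framework

    @property
    def priority(self) -> int:
        return self.likelihood * self.impact   # simple grid-based prioritization

taxonomy = [
    AgentRisk("Fabricated records in financial workflows", Dimension.ETHICAL, 3, 5, "Financial reporting"),
    AgentRisk("Goal drift after agent upgrades", Dimension.TECHNICAL, 2, 4, "Operational resilience"),
    AgentRisk("Unbounded supplier commitments", Dimension.OPERATIONAL, 2, 5, "Third-party risk"),
]

for risk in sorted(taxonomy, key=lambda r: r.priority, reverse=True):
    print(f"{risk.priority:>2}  {risk.name} -> {risk.enterprise_category}")
```

Keeping the enterprise category on each entry is what ties the agent-specific taxonomy back to the risk framework the organization already runs.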

Develop an expanded testing and evaluation infrastructure. Before deploying AI agents, it is vital to create controlled test environments that replicate real-world complexity. It is also crucial that these are not simple, one-agent sandboxes. As deployment picks up pace, agents will start to interact with other agents, and this must be replicated in the test environment to surface issues such as coordination failures, goal drift, and unwanted emergent patterns of behavior.

To help companies deploy AI agents, cloud vendors and AI platform providers offer tools for comprehensive testing, some general-purpose and others focused on specific, high-risk applications such as chatbots.

Only if these test environments duplicate the messy, complex real world with multiple agents operating in parallel will it be possible to see how the agents interact, coordinate, and, in some cases, compete. Once the environment is established, organizations should enforce standardized evaluation metrics for stability, quality, and compliance.
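The sketch below suggests one way such a harness could be structured, assuming agents expose a common step interface. The toy agents, shared environment, and compliance score are illustrative stand-ins; stability and quality metrics would follow the same pattern:

```python
# Sketch of a multi-agent test harness; the Agent protocol, toy agents, and
# metric definitions are assumptions made for illustration.
import random
from typing import Protocol

class Agent(Protocol):
    name: str
    def step(self, environment: dict) -> dict: ...   # returns the action taken

class ToyAgent:
    def __init__(self, name: str, error_rate: float):
        self.name = name
        self.error_rate = error_rate

    def step(self, environment: dict) -> dict:
        compliant = random.random() > self.error_rate
        return {"agent": self.name, "within_policy": compliant}

def evaluate(agents: list[Agent], episodes: int = 100) -> dict[str, dict]:
    """Run all agents against the same simulated environment and score them
    on a standardized metric (compliance shown here)."""
    random.seed(0)                       # reproducible test runs
    scores = {a.name: {"violations": 0, "actions": 0} for a in agents}
    environment = {"tick": 0}
    for _ in range(episodes):
        environment["tick"] += 1
        for agent in agents:             # agents act in parallel on shared state
            action = agent.step(environment)
            scores[agent.name]["actions"] += 1
            if not action["within_policy"]:
                scores[agent.name]["violations"] += 1
    return {
        name: {"compliance_rate": 1 - s["violations"] / s["actions"]}
        for name, s in scores.items()
    }

print(evaluate([ToyAgent("scheduler", 0.02), ToyAgent("procurement", 0.10)]))
```

Running several agents against the same shared state, rather than testing each in isolation, is what surfaces the coordination failures and emergent behaviors described above.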

Implement ongoing monitoring. This is the most crucial step. Agents must report their activity in real time so that higher-level monitoring systems can detect deviations or unwanted performance drift before damage is done, using behavior data collected prior to deployment as the baseline for comparison. This facilitates a fundamental shift: from assessing what is happening inside an AI agent to monitoring its activity. As the number of deployed agents rises, dashboards can report behavioral indicators, such as whether goals are shifting or whether actions are moving outside permitted ranges.

On top of this, there must be clearly defined escalation protocols for when agents step outside their expected bounds, even during the night shift. Remember: A strength of AI agents is that they don’t take time off or sleep, so monitoring must be always-on too. Some of this escalation may be safety-first, triggered as a precaution before any human review.
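One hedged sketch of how this could work: compare an agent’s live behavioral indicators with its pre-deployment baseline and, when drift crosses a threshold, pause the agent automatically before any human review. The indicator names, baseline values, and threshold here are illustrative assumptions:

```python
# Sketch of always-on behavioral monitoring with a precautionary pause.
# Baseline values, indicator names, and thresholds are illustrative assumptions.

BASELINE = {"actions_per_hour": 40.0, "out_of_range_actions": 0.01}
DRIFT_THRESHOLD = 0.5   # pause if any indicator drifts more than 50% from baseline

def drift(live: dict[str, float], baseline: dict[str, float]) -> dict[str, float]:
    """Relative deviation of each live indicator from its pre-deployment baseline."""
    return {k: abs(live[k] - baseline[k]) / max(baseline[k], 1e-9) for k in baseline}

def pause_agent(agent_id: str) -> None:
    print(f"[control] {agent_id} paused pending review")

def notify_on_call(agent_id: str, deviations: dict[str, float]) -> None:
    print(f"[escalation] {agent_id} drift: {deviations}")

def monitor(agent_id: str, live_metrics: dict[str, float]) -> str:
    deviations = drift(live_metrics, BASELINE)
    if any(d > DRIFT_THRESHOLD for d in deviations.values()):
        pause_agent(agent_id)                   # safety-first: act before review
        notify_on_call(agent_id, deviations)    # escalate to a human, at any hour
        return "paused"
    return "ok"

print(monitor("invoice-agent", {"actions_per_hour": 95.0, "out_of_range_actions": 0.02}))
```

The check runs on every reporting cycle, so the precautionary pause happens at machine speed while the human review follows at human speed.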

Design for robustness and resilience. These measures are difficult to retrofit into half-developed agents. It makes much more sense, and provides more comprehensive risk reduction, if safety, continuity, and fallback measures are built into the design from the start. It is also essential to consider the human side of risk reduction from the outset. As AI agents increasingly drive mission-critical business systems, organizations must consider how they will stay open for business if some agents need to be taken offline because of unexpected, unwanted interactions. Think about how to contain cascading failures in real time. Also, human oversight is not an easy cure-all: it, too, needs careful design, and patching it in during implementation, or worse, at deployment, is too late.
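To make the resilience point concrete, the sketch below shows one familiar pattern, a circuit breaker with a fallback path, so that a misbehaving agent can be taken offline while the business process continues. The class, thresholds, and fallback are illustrative, not a prescription:

```python
# Sketch of a circuit-breaker wrapper around an agent action, with a fallback
# path so the business process keeps running if the agent is taken offline.
# Thresholds and function names are illustrative assumptions.

class AgentCircuitBreaker:
    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0
        self.open = False          # "open" means the agent is taken offline

    def call(self, agent_action, fallback, *args):
        if self.open:
            return fallback(*args)                 # containment: route around the agent
        try:
            result = agent_action(*args)
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.open = True                   # stop cascading failures early
            return fallback(*args)                 # degrade gracefully, stay in business

def flaky_agent_action(order_id: str) -> str:
    raise RuntimeError("unexpected agent behavior")

def manual_queue_fallback(order_id: str) -> str:
    return f"{order_id} routed to human review queue"

breaker = AgentCircuitBreaker(max_failures=2)
for order in ["order-17", "order-18", "order-19"]:
    print(breaker.call(flaky_agent_action, manual_queue_fallback, order))
```

Wrapping agent actions this way also gives the escalation protocols described earlier a clean point of intervention: taking an agent offline becomes a routine, reversible control rather than an emergency.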

But there is also a bigger point. AI agents must be deployed in a way that aligns with the organization’s risk appetite. Every company needs to decide: Where are we comfortable using AI agents? Where are the no-go zones? For instance, in health care, a provider may allow AI agents to communicate with staff and patients and to access a wide range of data and systems, while keeping patient records out of bounds. This creates a conceptual “sandbox” for extensive innovation with AI agents that can be trusted not to leak or misuse personal information.
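In the health care example, that risk appetite could be encoded as an explicit, machine-enforceable policy; the system names and check below are hypothetical illustrations:

```python
# Sketch of a machine-enforceable risk-appetite policy; system names and the
# check function are hypothetical illustrations of the health care example.

RISK_APPETITE_POLICY = {
    "allowed_systems": {"staff_messaging", "patient_messaging", "scheduling", "billing_summary"},
    "no_go_zones": {"patient_records"},           # out of bounds by design
}

def is_access_permitted(system: str, policy: dict = RISK_APPETITE_POLICY) -> bool:
    """Deny by default: a system must be explicitly allowed and not a no-go zone."""
    return system in policy["allowed_systems"] and system not in policy["no_go_zones"]

for system in ["patient_messaging", "patient_records", "hr_payroll"]:
    verdict = "permitted" if is_access_permitted(system) else "blocked"
    print(f"{system}: {verdict}")
```

Because the policy is data rather than code buried in each agent, widening or narrowing the no-go zones later is a governance decision, not a re-engineering project.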

As companies become more comfortable with AI agents, decisions on risk appetite may be revisited and the no-go areas reduced. (See “Six Questions CEOs Must Ask About AI Agents.”)

Six Questions CEOs Must Ask About AI Agents
  • Where do we need AI agents vs. other AI technologies, and in what areas does our risk appetite allow AI agents to be deployed?
  • Do we know where AI agents operate across our business and vendor ecosystem—and how mature our governance truly is?
  • Is our governance model designed for autonomous systems, or still built for traditional AI?
  • Can we safely test and validate autonomous behaviors before they reach production?
  • Are we managing AI risk actively and continuously, or reactively at the end of the development cycle?
  • When an agent inevitably fails, are we prepared to fail safely—with built-in resilience, rapid recovery, and transparency?

A Resilient Outlook

This combination of new, unfamiliar risks may seem daunting, and it would be easy for organizations to decide that the potential benefits of AI agents are not worth the risks.

However, this would be a mistake. There are good reasons why, according to the MIT Sloan Management Review/Boston Consulting Group study, just two years after the technology went mainstream, some 35% of organizations have adopted AI agents, with another 44% planning to deploy soon.

Understanding and managing the risks that AI agents pose allows organizations to focus on seizing the incredible opportunity these agents offer.