The 2026 International AI Safety Report, chaired by Turing Award winner Yoshua Bengio and produced by more than 100 experts from over 30 countries, delivers a message that should reshape how enterprises think about risk. The most pressing dangers from artificial intelligence, the report concludes, come not from the models themselves but from the complex systems companies build around them.
The report, backed by the OECD, the EU and the United Nations and covered by IBM Think, marks a sharp departure from last year’s edition. The 2025 report concentrated on model behavior: hallucinations, bias, benchmark failures. This year’s edition zeroes in on what happens after deployment, when AI systems trigger business processes, access sensitive data, make autonomous decisions and interact with other systems in ways their operators may not fully understand.
Kush Varshney, an IBM researcher who served as a reviewer on the report, pointed to one finding that should get the attention of every enterprise CTO. The report describes what it calls “jagged” capability growth, a pattern in which AI systems make sudden leaps in some domains while remaining unreliable or brittle in others. Leading systems can now solve International Mathematical Olympiad problems and complete coding tasks that would have taken a human programmer hours. Yet those same systems stumble at counting objects in an image, reasoning about physical space and recovering from basic errors during longer workflows.
Varshney told IBM Think that this jaggedness argues for a different engineering approach. “I think this highlights why enterprises should consider the paradigm of generative computing, where individual AI calls are grounded through modular verification,” he said. “Taking that approach can make the overall system reliable and consistent.”
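To make the idea of modular verification concrete, here is a minimal Python sketch of one way it could look. It is an illustration under stated assumptions, not the approach described in the report or any IBM implementation: the names `verified_call`, `VerifiedResult`, `is_integer`, `within_bounds` and `make_stub_model` are hypothetical, and the "model" is a stub that returns canned replies rather than a real AI system. Each model call passes through small, independent checks, and an output is accepted only if every check passes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical types for illustration; not taken from the report or any library.
Generator = Callable[[str], str]
Verifier = Callable[[str], Optional[str]]  # error message, or None if the check passes


@dataclass
class VerifiedResult:
    output: Optional[str]   # accepted output, or None if every attempt failed
    attempts: int           # number of model calls made
    failures: List[str]     # verifier messages collected along the way


def verified_call(prompt: str, generate: Generator, verifiers: List[Verifier],
                  max_attempts: int = 3) -> VerifiedResult:
    """Call the model, run every verifier on the output, retry on failure."""
    failures: List[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = generate(prompt)
        errors = [msg for check in verifiers if (msg := check(candidate)) is not None]
        if not errors:
            return VerifiedResult(candidate, attempt, failures)
        failures.extend(errors)
    # Nothing passed: return no output so the caller can escalate instead of guessing.
    return VerifiedResult(None, max_attempts, failures)


def is_integer(text: str) -> Optional[str]:
    """Format check: the output must be a plain integer."""
    ok = text.strip().lstrip("-").isdigit()
    return None if ok else f"not an integer: {text!r}"


def within_bounds(text: str) -> Optional[str]:
    """Range check: an integer output must fall between 0 and 100."""
    try:
        value = int(text)
    except ValueError:
        return None  # format problems are reported by is_integer
    return None if 0 <= value <= 100 else f"out of range: {value}"


def make_stub_model(replies: List[str]) -> Generator:
    """Stand-in for a real model call (assumption): replays canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


model = make_stub_model(["about forty", "420", "42"])
result = verified_call("How many items are in the image?", model,
                       [is_integer, within_bounds])
print(result)
```

In this sketch the first two canned replies are rejected by different verifiers and only the third is accepted. When no attempt passes, the function returns no output at all, which leaves the decision about what to do next to the surrounding system rather than to the model.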
Francesca Rossi, IBM Global Leader for Responsible AI and AI Governance, called the shift from model-level to system-level thinking the report’s most significant development. “AI safety is no longer mainly a model issue, but rather a system and deployment issue,” she told IBM Think. “AI systems aren’t just generating text now. They are influencing decisions, triggering processes, accessing data and interacting with other systems. That means safety must draw from disciplines like cybersecurity, risk management and safety engineering, not just model evaluation.”
The scale of adoption underscores the stakes. According to the report, AI is one of the fastest-adopted consumer technologies in history. Agentic AI systems, which can plan, pursue goals and interact with external tools autonomously, pose heightened risks because they act without waiting for human approval at each stage. Rossi said failures now tend to happen between components rather than inside any single model. “Governance has to extend beyond the model lifecycle into system design and management,” she said. “A nominal ‘human-in-the-loop’ approach is not enough. If humans are overloaded or lack the right information, oversight becomes symbolic.”
Compounding the problem, pre-deployment safety testing itself has become less reliable, according to the report. Varshney said the field needs to respond. “We need to shift from static evaluation and alignment to dynamic steerability,” he said. “We should also focus less on universal definitions of harmfulness and more on context-specific, scoped notions of harm that respect sovereignty and the diverse needs of users around the world.”
The report also warns that AI is lowering the barrier to sophisticated hacking. AI systems can discover software vulnerabilities and write malicious code. Criminal groups and state-associated attackers are actively using general-purpose AI in their operations. Dawn Song, a Professor of Computer Science at UC Berkeley who contributed to the report, sees 2025 as a turning point. “Year 2025 marked a step change in frontier AI capabilities in cybersecurity,” Song told IBM Think. Through research efforts including CyberGym and BountyBench, Song’s team at Berkeley has demonstrated that AI can find zero-day vulnerabilities in large-scale, widely distributed open-source software. The researchers launched the Frontier AI Cybersecurity Observatory for continuous monitoring and recently published a paper that promotes using AI for automatic theorem proving and verifiable code generation with provable guarantees.
Some experts said the report underplays a critical piece of AI safety: the internal dynamics of organizations. Rossi argued that the organizational dimension of AI safety is underrepresented. “Enterprise AI safety is not just a socio-technical issue, but also an organizational challenge,” she said. “Organizations need to make decisions about incentives, skills and sustained commitment under business pressure.” She added that companies prioritizing safety are finding that it strengthens stakeholder trust, reduces downstream risks and creates sustainable value over time.
Varshney believes the report also underplays a wider philosophical risk. “I think it could have said more about the potential loss in human agency if we don’t do AI right,” he said. “We have to make sure that we don’t create a ‘helicopter-parent’ AI that stifles the human-centric goals it was meant to support.”
Susan Leavy, an Assistant Professor at University College Dublin and a senior adviser on the report, said the most widely used AI algorithms are often optimized for objectives that are misaligned with human values, such as engagement metrics that drive platform usage at the expense of users. “Along with increased capabilities of AI, there has been a dramatic increase in AI adoption, and we need to safeguard human autonomy very carefully,” Leavy told IBM Think. She warned that the greater danger may not be a dramatic loss of control, but the slow normalization of dependency. “Humanity losing control of algorithms is unfolding in a much more mundane way, and rather than a hostile takeover, we seem to cede autonomy voluntarily,” she said.
Among the report’s more immediate concerns is the rise of agreeable AI. Balaraman Ravindran, who heads the Department of Data Science and Artificial Intelligence at IIT Madras and contributed to the report, zeroed in on sycophancy, the tendency of AI chatbots to agree with users rather than challenge them. “I have been most surprised by the emotional effect that the phenomenon of sycophancy has had,” Ravindran told IBM Think. “I would expect people to distrust someone who agrees with them all the time, but it appears that people, especially emotionally vulnerable persons, become more suggestible in such circumstances.” Ravindran said the finding has shifted his own thinking on regulation. “Perhaps, we need some legally mandated guardrails for chatbots released for general use and, of course, those that are serving any counseling or mental health purpose.”
According to the report, twelve companies published or updated their own frontier AI safety frameworks in 2025, outlining how they assess and mitigate risks from advanced models. Even so, most risk management efforts remain voluntary.
For enterprises, the challenge is no longer purely technical. “The key challenge now is not only building capable and well-aligned models, but ensuring that complex and possibly agentic AI systems are governed, monitored and accountable in real-world enterprise environments over time,” Rossi said. The report makes clear that the safety conversation has moved from the lab to the data center floor, where the systems that fail are the ones enterprises build themselves.