Artificial intelligence systems are increasingly capable of operating as AI agents, autonomously planning and acting in pursuit of a goal with little or no human oversight, according to the first international report on the safety of artificial intelligence.
The report – led by Université de Montréal computer science professor Yoshua Bengio – says the most advanced AI systems in the world can now write increasingly sophisticated computer programs, identify cyber vulnerabilities, and perform on a par with human PhD-level experts on tests in biology, chemistry and physics.
"The capabilities of general-purpose AI have increased rapidly in recent years and months,” Bengio, founder and scientific director of Mila – Quebec AI Institute, said in a statement.
“While this holds great potential for society, AI also presents significant risks that must be carefully managed by governments worldwide,” he said.
“This report by independent experts aims to facilitate constructive and evidence-based discussion around these risks and serves as a common basis for policymakers around the world to understand general-purpose AI capabilities, risks and possible mitigations.”
The report, announced in November 2023 at the AI Safety Summit at Bletchley Park, England, and inspired by the workings of the United Nations Intergovernmental Panel on Climate Change, consolidates leading international expertise on AI and its risks.
This final International AI Safety Report follows an interim publication in May 2024.
With support from the United Kingdom’s Department for Science, Innovation and Technology, Bengio, who is also a Canada CIFAR AI Chair, led a team of 96 international experts in drafting the report.
The experts were drawn from 30 countries, the United Nations, the European Union and the OECD. Their report will help inform discussions at the AI Action Summit in Paris, France, on February 10 and 11, and serve as a global handbook on AI safety to support policymakers.
As policymakers worldwide grapple with rapid and unpredictable advances in AI, the report helps bridge the gap between the pace of the technology and policymakers’ understanding of it, offering a scientific account of emerging risks to guide decision-making.
The document sets out the first comprehensive, independent and shared scientific understanding of advanced AI systems and their risks, highlighting how quickly the technology has evolved.
In fact, between the writing of the report in December 2024 and its publication in January 2025, AI developer OpenAI shared early test results from a new AI model, OpenAI o3. These results indicate significantly stronger performance than any previous model on a number of the field’s most challenging tests of programming, abstract reasoning and scientific reasoning, Bengio noted.
“In some of these tests, [OpenAI] o3 outperforms many (but not all) human experts. Additionally, it achieves a breakthrough on a key abstract reasoning test that many experts, including myself, thought was out of reach until recently.”
Generating evidence on the safety and security implications of the trends implied by o3 will be an urgent priority for AI research in the coming weeks and months, Bengio said.
Research urgently needed in several areas
Several areas require urgent research attention, according to the report, including how rapidly capabilities will advance, how general-purpose AI models work internally, and how they can be designed to behave reliably.
The report identifies three distinct categories of AI risk: malicious use, malfunctions and systemic risks. Its findings include the following:
General-purpose AI makes it easier to generate persuasive content at scale. This can help actors who seek to manipulate public opinion, for instance to affect political outcomes.
General-purpose AI can also make it easier or faster for malicious actors of varying skill levels to conduct cyberattacks.
Current AI systems have demonstrated capabilities in low- and medium-complexity cybersecurity tasks, and state-sponsored actors are actively exploring AI to survey target systems.
New research has confirmed that the capabilities of general-purpose AI related to cyber offence are significantly advancing, but it remains unclear whether this will affect the balance between attackers and defenders.
General-purpose AI systems can amplify social and political biases, causing concrete harm. They frequently display biases with respect to race, gender, culture, age, disability, political opinion or other aspects of human identity.
This can lead to discriminatory outcomes including unequal resource allocation, reinforcement of stereotypes and systematic neglect of underrepresented groups or viewpoints.
General-purpose AI R&D is currently concentrated in a few Western countries and China. This “AI divide” has the potential to increase much of the world’s dependence on this small set of countries, the report says. Some experts also expect it to contribute to global inequality.
Growing compute use in general-purpose AI development and deployment has rapidly increased the amounts of energy, water and raw material consumed in building and operating the necessary compute infrastructure.
This trend shows no clear indication of slowing, despite progress in techniques that allow compute to be used more efficiently.
A small number of companies currently dominate the market for general-purpose AI. This market concentration could make societies more vulnerable to several systemic risks.
The report places particular emphasis on the urgency of increasing transparency and understanding in AI decision-making as the systems become more sophisticated and the technology continues to develop at a rapid pace.
While there are still many challenges in mitigating the risks of general-purpose AI, the report highlights promising areas for future research and concludes that progress can be made.
Key findings of the report
The report’s key findings are:
Recent research suggests that rapidly scaling up models may remain physically feasible for at least several years. But major capability advances may also require other factors: for example, new research breakthroughs, which are hard to predict, or the success of a novel scaling approach that companies have recently adopted.
For example, according to recent estimates, state-of-the-art AI models have seen annual increases of approximately four times in computational resources (compute) used for training and 2.5 times in training dataset size.
If recent trends continue, by the end of 2026 some general-purpose AI models will be trained using roughly 100 times more training compute than 2023's most compute-intensive models, growing to 10,000 times more training compute by 2030, combined with algorithms that achieve greater capabilities for a given amount of available computation.
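To make the scaling arithmetic concrete, the short sketch below compounds the growth rates quoted above (roughly four times per year in training compute and 2.5 times per year in dataset size) from a 2023 baseline. The growth constants and baseline year are assumptions taken from those figures for illustration; this is not the report’s own projection methodology.

```python
# Illustrative compounding of the trends quoted above:
# ~4x/year in training compute and ~2.5x/year in dataset size,
# measured against 2023's most compute-intensive models.
# These constants are assumptions for illustration only.

COMPUTE_GROWTH_PER_YEAR = 4.0
DATA_GROWTH_PER_YEAR = 2.5
BASELINE_YEAR = 2023

def projected_multiplier(annual_rate: float, years: int) -> float:
    """Compound an annual growth rate over a whole number of years."""
    return annual_rate ** years

for target_year in (2026, 2030):
    years = target_year - BASELINE_YEAR
    compute = projected_multiplier(COMPUTE_GROWTH_PER_YEAR, years)
    data = projected_multiplier(DATA_GROWTH_PER_YEAR, years)
    print(f"{target_year}: ~{compute:,.0f}x training compute, "
          f"~{data:,.0f}x training data vs. {BASELINE_YEAR}")
```

Compounded this way, four times per year yields roughly 64x by 2026 and about 16,000x by 2030 – the same orders of magnitude as the “100 times” and “10,000 times” figures cited above.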
Rapid capability advancement makes it possible for some risks to emerge in leaps; for example, the risk of academic cheating using general-purpose AI shifted from negligible to widespread within a year. The more quickly a risk emerges, the more difficult it is to manage the risk reactively and the more valuable preparation becomes.
Companies often share only limited information about their general-purpose AI systems, especially in the period before they are widely released. Companies cite a mixture of commercial concerns and safety concerns as reasons to limit information sharing.
However, this information gap also makes it more challenging for other actors to participate effectively in risk management, especially for emerging risks.
In some circumstances, competitive pressure may incentivize companies to invest less time or other resources into risk management than they otherwise would.
Similarly, governments may invest less in policies to support risk management in cases where they perceive trade-offs between international competition and risk reduction.
For example, U.S. President Donald Trump has already rolled back AI safety and environmental regulations while threatening protectionist measures.
More risks emerging but techniques to manage risks are limited
Other key findings in the report are:
There is broad consensus among researchers that progress on key open questions would be helpful: how rapidly general-purpose AI capabilities will advance, how these models work internally, and how they can be designed to behave reliably.
AI does not happen to us: choices made by people determine its future, the report notes. “The future of general-purpose AI is uncertain, with a wide range of trajectories appearing possible even in the near future, including both very positive and very negative outcomes.”
This uncertainty can evoke fatalism and make AI appear as something that happens to us. But it will be the decisions of societies and governments on how to navigate this uncertainty that determine which path we will take.
General-purpose AI has immense potential for education, for medical applications, for research advances in fields such as chemistry, biology and physics, and for broadly increased prosperity thanks to AI-enabled innovation.
“How general-purpose AI gets developed and by whom, which problems it gets designed to solve, whether societies will be able to reap general-purpose AI’s full economic potential, who benefits from it, the types of risks we expose ourselves to, and how much we invest into research to manage risks – these and many other questions depend on the choices that societies and governments make today and in the future to shape the development of general-purpose AI.”
Organizations: Mila – Quebec AI Institute and Université de Montréal
People: Yoshua Bengio
Topics: International AI Safety Report