- AI agents – tools that automate complex tasks – are having a resurgence due to the advent of large language models (LLMs).
- The development and deployment of more complex AI systems requires better safety and governance processes.
- The World Economic Forum and Capgemini have published a new white paper, Navigating the AI Frontier: A Primer on the Evolution and Impact of AI Agents, which explores the capabilities and implications of these smart assistants.
The concept of artificial intelligence (AI) agents was born in the 1990s as intelligent software entities operating autonomously on behalf of users, navigating environments such as the internet, which at the time consisted of little more than text, sparsely interspersed with images and hyperlinks. There were no capable search engines, so having AI agents browse on a user's behalf seemed useful, and the state of AI gave them a realistic chance only in such simplified environments, not in the complex real world of humans.
With the advent of large language models (LLMs) and their power of natural language understanding and reasoning, there has been a resurgence of interest in AI agents. Companies such as Salesforce and ServiceNow are starting to announce specialized agents that represent their applications. For example, Salesforce is using agents to streamline CRM by automating tasks, analysing data and personalizing customer interactions. These modern AI agents use LLMs to understand and respond to natural language, and to decide when and how to call tools, such as web search, code execution, API calls and data retrieval commands.
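To make this concrete, here is a minimal sketch of that tool-calling loop. The tool names (`web_search`, `run_code`), the stubbed `call_llm` function and the shape of its response are assumptions for illustration, not any specific vendor's SDK:

```python
import json

# Illustrative tool registry: each tool is a plain function the agent may call.
TOOLS = {
    "web_search": lambda query: f"[search results for '{query}']",
    "run_code": lambda source: f"[output of: {source}]",
}

def call_llm(messages):
    """Stand-in for a real LLM call. A production agent would send `messages`
    to a model, which replies with either a final answer or a tool call."""
    return {"tool": "web_search", "arguments": {"query": messages[-1]["content"]}}

def run_agent(user_request):
    messages = [{"role": "user", "content": user_request}]
    decision = call_llm(messages)
    if decision.get("tool"):  # the model chose to act through a tool
        result = TOOLS[decision["tool"]](**decision["arguments"])
        messages.append({"role": "tool", "content": result})
        # A real agent loops here, feeding results back until a final answer.
    return messages

print(json.dumps(run_agent("summarize this week's sales pipeline"), indent=2))
```

The key design point is that the model itself decides whether a tool is needed; the surrounding code merely dispatches and records the result.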
Safety and governance processes for AI agents
Given their role, a level of autonomy is expected of an agent as it interacts with the world using its tools. In some cases, like running code in a container, the consequences can be controlled and contained. In others, a level of scrutiny and care needs to be applied to avoid undesirable consequences of an agent’s autonomous behaviour. The fact that the “brain” of an agent is an inherently opaque neural network (i.e. an LLM) makes this more complex. LLM-based agents are also prone to hallucinations or misunderstanding of inherently ambiguous natural language communications, so what we gain in robustness we can lose in consistency.
Several mitigation strategies are available to reduce these risks, including implementing rules for overriding or seeking human approval for certain agent decisions, assessing uncertainty of agent behaviour and pairing agents with safeguard agents that monitor and prevent potential harm.
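As one illustration of the first of these mitigations, here is a minimal sketch of a human-approval gate. The set of "risky" action names and the simple allow-list check are assumptions made for this example:

```python
# Actions we assume are consequential enough to need a human in the loop.
RISKY_ACTIONS = {"send_email", "execute_payment", "delete_records"}

def execute_with_oversight(action_name, action_fn, *args):
    """Run an agent-chosen action, pausing for human approval if it is risky."""
    if action_name in RISKY_ACTIONS:
        answer = input(f"Agent requests '{action_name}' with {args}. Approve? [y/N] ")
        if answer.strip().lower() != "y":
            return f"'{action_name}' blocked pending human review"
    return action_fn(*args)

# Low-risk actions run freely; risky ones wait for a human decision.
print(execute_with_oversight("web_search", lambda q: f"[results for '{q}']", "agent safety"))
print(execute_with_oversight("execute_payment", lambda amount: f"paid {amount}", "$100"))
```

A safeguard agent would play a similar role, with another model (rather than a hard-coded list) judging whether a proposed action should be escalated.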
Given this required oversight, we need a testing regime for agent-based systems that differs from what we are used to in traditional software. The good news, however, is that we already know how to oversee such systems: we have been operating human-driven organizations and workflows since the dawn of industrialization.
The state of the art in generative AI models prevents a single agent from effectively taking in a large body of complex instructions and carrying out many varied and complex tasks. Given context window restrictions, as well as the limited reasoning power of LLMs, it is often more effective to break responsibilities down across multiple AI agents that connect and coordinate to get work done.
Just as we do not assign a single software engineer the task of writing a fully-fledged CRM system, asking an AI agent to write code for a full CRM system may be a tall order, but having a team of agents representing various responsibilities such as project management, front-end and back-end engineering, and quality assurance, working together is more likely to get the job done successfully.
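A minimal sketch of that decomposition follows, with each role agent as a stub standing in for an LLM-backed worker. The role names mirror the analogy above, and the one-pass orchestration is a deliberate simplification:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str

    def work(self, task: str) -> str:
        # A real agent would prompt an LLM with role-specific instructions
        # and its own tools; here we just record who did what.
        return f"[{self.role}] completed: {task}"

def build_feature(feature: str) -> list[str]:
    manager = Agent("project_manager")
    team = [Agent("frontend_engineer"), Agent("backend_engineer"), Agent("qa_engineer")]
    results = [manager.work(f"break '{feature}' into subtasks")]
    results += [member.work(feature) for member in team]
    return results

for line in build_feature("contact management module"):
    print(line)
```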
The role of multi-agent AI
The field of multi-agent AI calls for agents that can communicate and coordinate. This can be in the context of a team of agents, which implies collaboration, or it may be across teams of agents with varying levels of trust, or even adversarial relationships.
A multi-agent system representing a piece of software or an organization's various workflows can have several interesting advantages, including improved productivity, operational resilience, greater robustness and faster upgrades of individual modules. Businesses are rapidly moving to adopt single-agent solutions, and multi-agent systems seem to be an inevitable, and quite disruptive, future state.
LLMs understand natural language, so, out of the box, we have a universal protocol for inter-agent communication. Because natural language can flexibly express intent, and LLM-based agents can map that intent to the way their tools are called, the software systems of the future will be much less brittle than those we have today: an agent responsible for a piece of functionality takes care of mapping intent to specific API calls, so changing the API, or employing different software with similar capabilities, becomes much less of a hassle. This will give organizations more flexibility in upgrades or with third-party services.
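A minimal sketch of that decoupling, assuming two hypothetical CRM backends with different call shapes: only the adapter layer changes when the API does, while the caller keeps expressing the same intent in natural language:

```python
def legacy_crm_create(first: str, last: str) -> str:   # old API: two positional fields
    return f"legacy CRM: created contact {first} {last}"

def new_crm_create(payload: dict) -> str:              # replacement API: one JSON payload
    return f"new CRM: created contact {payload['name']}"

def handle_intent(intent: str, backend: str) -> str:
    """Map a natural-language intent to whichever concrete API is in use.
    A real agent would have an LLM parse the intent; we parse trivially."""
    if intent.startswith("add contact "):
        name = intent.removeprefix("add contact ")
        if backend == "legacy":
            first, last = name.split(maxsplit=1)
            return legacy_crm_create(first, last)
        return new_crm_create({"name": name})
    return "intent not understood"

# Swapping the backend changes nothing for whoever expresses the intent.
print(handle_intent("add contact Ada Lovelace", backend="legacy"))
print(handle_intent("add contact Ada Lovelace", backend="new"))
```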
https://www.weforum.org/stories/2025/01/ai-agents-multi-agent-systems-safety/