Everyone remembers that pivotal moment when we first saw what large language models (LLMs) and Generative AI (GenAI) could accomplish. Suddenly, the long-discussed theory of conversational, intuitive, creative AI became a reality, right there at our fingertips.
But as companies dove into testing GenAI’s potential, many came to recognize the limitations of standalone GenAI models. Context and reasoning limitations of typical LLMs can make it difficult to apply GenAI to complex, multistep workflows. As with traditional AI, hallucination and bias can create significant barriers to trust. And the creative outputs for which GenAI is celebrated require continuous human monitoring for quality and accuracy.
AI agents and multiagent AI systems are helping organizations hurdle these limitations and make the cognitive leap into a new paradigm of business process transformation and innovation.
Business executives say deeply embedding GenAI into business functions and processes is the No. 1 way to drive value from the technology.¹
As people, we can understand language and creatively articulate responses. By employing specialized tools, we can amplify our physical and mental capabilities. By learning and remembering information, we avoid mistakes and improve on what we’ve already accomplished.
Language, planning, reasoning, reflection, and the ability to use tools, data and memory: These same attributes are central to how AI agents work and how they demonstrate cognitive abilities of their own.
In the realm of business, AI agents and human workers have other similarities. Both must be carefully selected, well trained and well equipped to perform their jobs. And both should be smartly deployed and consistently managed in ways that help ensure efficient, value-adding performance.
Not surprisingly, then, our recommended principles of AI agent design and management echo familiar themes from organizational design and human resource management.
The following approaches can help your organization ensure efficiency and effectiveness across your agentic “workforce.” Read our full report to understand how to put these principles into action.
As any leader today knows, individual strengths are no match for team synergy. Organized and managed well, teamwork leverages and amplifies the strengths of each individual, making it possible to achieve goals that no one person could reach alone.
As with people, so too with AI agents. By leveraging an “agency” of role-specific AI agents, multiagent AI systems can understand requests, plan workflows, delegate and coordinate agent responsibilities, streamline actions, collaborate with humans, and ultimately validate and improve outputs.
Multiagent AI systems have the potential to impact every layer of enterprise architecture—not just automating existing processes and tasks, but also reinventing them.
By engaging with users and within workflows semantically rather than syntactically, AI agents can comprehend emerging needs and address them in novel ways that obviate traditional, rules-based processes. By continuously self-monitoring, multiagent AI systems can improve their outputs in near real time. Meanwhile, the shared persistent state of AI agents in a system enables them to collaborate and coordinate activities in ways that continuously improve efficiency.
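To make the coordination pattern described above more concrete, here is a minimal sketch of role-specific agents working through a shared state: one plans, two carry out delegated steps, and one validates the result. Everything in it is an illustrative assumption rather than a description of any particular product: the names `SharedState`, `planner`, `researcher`, `writer`, `validator` and `run` are hypothetical, and each agent is a plain function standing in for what would normally be a language-model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SharedState:
    """State visible to every agent in the system (hypothetical structure)."""
    request: str
    plan: List[str] = field(default_factory=list)
    results: Dict[str, str] = field(default_factory=dict)
    issues: List[str] = field(default_factory=list)


# Each "agent" is a plain function here; a real agent would wrap a language model.
Agent = Callable[[SharedState], None]


def planner(state: SharedState) -> None:
    # A fixed plan stands in for model-driven planning.
    state.plan = ["research", "draft"]


def researcher(state: SharedState) -> None:
    state.results["research"] = f"notes on: {state.request}"


def writer(state: SharedState) -> None:
    # Reads the researcher's output from shared state instead of receiving it directly.
    notes = state.results.get("research", "")
    state.results["draft"] = f"summary based on {notes}" if notes else ""


def validator(state: SharedState) -> None:
    # Self-monitoring step: flag any planned step that produced no output.
    for step in state.plan:
        if not state.results.get(step):
            state.issues.append(f"missing output for step '{step}'")


# Maps each plan step to the role-specific agent responsible for it.
WORKERS: Dict[str, Agent] = {"research": researcher, "draft": writer}


def run(request: str) -> SharedState:
    state = SharedState(request=request)
    planner(state)
    for step in state.plan:   # delegate each planned step to its agent
        WORKERS[step](state)
    validator(state)          # validate outputs before returning them
    return state


state = run("Q3 supplier risk")
print(state.results["draft"])
print(state.issues)
```

The design choice worth noting is that agents never call each other directly. They read from and write to the shared state, which is what lets a validating agent inspect every step after the fact.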
“Synergy (is) the bonus that is achieved when things work together harmoniously.” – Mark Twain
In our full report we explore how the following principles can help ensure multiagent AI systems are robust, reliable and trustworthy.
Scalable impact from multiagent systems depends on treating them as an ecosystem of capabilities rather than as solutions, and on developing a reference architecture that can support both business and technical delivery processes.
The essential layers of a reference architecture are shown in the chart below. In our full report, we show how this reference architecture can be put into action through an example use case common to all companies.
Layer: Interaction
Purpose: Allow users, processes and existing applications to collaborate with multiagent AI systems.
Actions for success: Develop defensive user interfaces that can anticipate and mitigate potential user errors or misuse, while guiding the multiagent system(s) to respond contextually.
Layer: Workflow
Purpose: Ensure controlled flow engineering to help agents interact with each other efficiently and in a more deterministic manner.
Actions for success: Implement value-stream analysis to monitor efficiency and effectiveness of workflows. Identify governance guardrails and touch points for human monitoring (“human in the loop”) to help reduce risks.
Layer: Agents
Purpose: Create, manage, deploy and optimize role-specific AI agents.
Actions for success: Industrialize the creation of role-specific agents. Each agent should be equipped with a fit-for-use language model, tools that augment language model capabilities, approved data, short- and long-term memory, and access to prompts.
Layer: Agent operations
Purpose: Monitor outputs and metrics to help ensure agents are functioning as expected.
Actions for success: Implement instrumentation and telemetry, along with logs, traces and metrics, to gather data about system activities. Activate alerts and dashboards to simplify performance monitoring.
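As a rough illustration of how the four layers above might map onto code, the sketch below pairs each layer with one small piece: input checking for the interaction layer, a human-in-the-loop guardrail for the workflow layer, an agent specification for the agents layer, and logging plus counters for agent operations. All names here (`AgentSpec`, `validate_request`, `needs_human_review`, `run_tool`, `METRICS`), the model name `example-model`, and the payment threshold are hypothetical choices made for this example, not part of the reference architecture itself.

```python
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent_ops")


@dataclass
class AgentSpec:
    """Agents layer: what each role-specific agent is equipped with."""
    role: str
    model: str                                        # placeholder name, not a real product
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    approved_data: List[str] = field(default_factory=list)
    memory: List[str] = field(default_factory=list)   # simple stand-in for short- and long-term memory
    prompt: str = ""


def validate_request(text: str, max_len: int = 500) -> str:
    """Interaction layer: reject empty or oversized input before it reaches an agent."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("empty request")
    if len(cleaned) > max_len:
        raise ValueError("request too long")
    return cleaned


def needs_human_review(action: str, amount: float, limit: float = 1000.0) -> bool:
    """Workflow layer: a guardrail that routes risky actions to a person."""
    return action == "approve_payment" and amount > limit


METRICS: Dict[str, float] = {"calls": 0, "errors": 0, "total_seconds": 0.0}


def run_tool(agent: AgentSpec, tool: str, arg: str) -> str:
    """Agent operations layer: log each tool call and record basic metrics."""
    start = time.perf_counter()
    METRICS["calls"] += 1
    try:
        result = agent.tools[tool](arg)
        agent.memory.append(f"{tool}({arg}) -> {result}")
        return result
    except Exception:
        METRICS["errors"] += 1
        log.exception("tool %s failed for role %s", tool, agent.role)
        raise
    finally:
        METRICS["total_seconds"] += time.perf_counter() - start
        log.info("role=%s tool=%s", agent.role, tool)


invoice_agent = AgentSpec(
    role="invoice_checker",
    model="example-model",
    tools={"lookup": lambda inv: f"invoice {inv}: 1500.00"},
    prompt="Check invoices against purchase orders.",
)

request = validate_request("  check invoice 42  ")
lookup_result = run_tool(invoice_agent, "lookup", "42")
review = needs_human_review("approve_payment", 1500.0)
print(request, "|", lookup_result, "| human review needed:", review)
```

In a production setting the counters and log lines would typically feed the alerts and dashboards mentioned above. The dictionary used here is only a stand-in to show where that instrumentation would sit.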
Contributors to this report: Jim Rowan, Brijraj Limbad, Pradeep Gorai, Caroline Ritter, Brendan McElrone, Laura Shact