Government leaders around the world appear to be increasingly recognizing the transformative potential of artificial intelligence. But there is one problem: Realizing the value of that potential means adopting AI at scale, and government agencies may struggle to move beyond small-scale pilots. Generative AI, with its at-scale solutions still emerging, remains a work in progress.
Taking what does work and scaling it up should involve more than copying what companies are doing. With different incentives and risks than private industry—and, often, higher stakes for constituents—agencies may need to take a distinct path to scaling AI applications. Some government leaders are beginning to move forward in ways that are employee-driven.1
This path to scaling includes steps that agencies may find challenging. An employee-driven strategy relies on each employee having the necessary AI fluency, a level of training that will vary by job, level, and role. And even more basically, any investment in AI should weigh the costs of implementation against the benefits to the public. This innocuous statement holds two difficult truths for government leaders: one, that a lack of technical expertise can obscure the true costs of AI, making it appear either prohibitively expensive or unrealistically cheap; and two, that leaders need ways to measure the difficult-to-quantify mission outcomes of AI investments.
And it’s a case that leaders may not be able to afford to delay making. The AI future is here; agencies should take the initiative in shaping how the technology can best help fulfill the public sector’s mission.
AI’s potential impact on the public good is truly monumental, even with all the usual caveats. But it’s unlikely that an agency’s tech-aided initiatives can make much of an impact without reaching scale in AI adoption. This creates a fundamental paradox: At-scale adoption requires widespread use, but widespread use can also introduce risks that government agencies need to manage.
Some places are already thinking along these lines: Buenos Aires, for example. After introducing an app chatbot in 2019, the city government steadily expanded its capabilities, until by 2022, residents could use the bot to access social services, apply for a construction permit, or even report infrastructure in need of maintenance. By the end of that year, the bot had processed more than 58 million interactions, allowing Buenos Aires residents to access critical services 24/7.4
But scale can mean more than having a large number of users—it can mean becoming central to an organization’s operations. For example, the US Treasury Department has started using AI tools to locate potential fraud in government payments, preventing or recovering US$4 billion in improper payments—more than five times as much as in 2023. But the tools’ real power comes from breaking down data silos across agencies. The Treasury’s “Do Not Pay” service has expanded to integrate with state unemployment agencies and the Social Security Administration’s Death Master File, adding verification capabilities and saving money across federal payment systems.5 And even greater data integration across organizational silos is underway.6
These examples underscore not only AI’s potential benefits but also the fact that, as agencies worldwide move to adopt applications, more government employees should be familiar with the technology and its expanding uses. Social service workers in Buenos Aires should understand the limitations of the AI chatbot so they can be sure it gives users valid information; employees in state unemployment offices should have sufficient AI fluency to use the Treasury’s Do Not Pay service so that it stops fraud without delaying legitimate payments to workers in need. To get to scale, agencies should have employees trained both to work with AI and to use applications to find ways to make their daily work more efficient and accurate.
But government work is sensitive, and scaling AI should include managing the risks of deploying it widely in public arenas. Giving workers broad access to applications—especially those based on gen AI—can quickly uncover new areas in which AI could help agencies fulfill their mission, but it can also dramatically increase risks. New AI solutions might fail to work as intended; far worse, they could generate inaccurate information and lead employees or constituents to make decisions based on that information. The question for agencies is how to get widespread, bottom-up adoption while controlling for risks.
Our research on AI adoption in government suggests strategies that can help with this paradox: Giving the right access to the right workers with the right skills, and measuring mission outcomes to stay on track.
Unlike in commercial enterprises, where leadership often drives AI scaling, in government it is line employees who may be poised to unearth transformative AI use cases. But to find and develop those use cases, workers need access to the appropriate tools and a level of training suited to their roles and occupations. Access to gen AI continues to be an issue, with only 1% of government respondents in a recent Deloitte survey reporting that 60% of their workforce had access to gen AI tools.7 This is beginning to change, albeit slowly, as more governments give their workers wider access to gen AI tools.
Over a six-month period, the Digital Transformation Agency of Australia deployed an AI assistant to more than 7,600 employees across some 60 government agencies to assess gen AI’s impact on employee efficiency and productivity. The initiative aimed to understand the efficiencies that the tool could bring to daily work—and identify opportunities for more tailored solutions. In the initial pilot, the tool saved each employee an average of one hour per day on administrative tasks.8
While government access to tools continues to grow slowly, leaders can move to address the AI-fluency issue. Not every worker needs to know how to fine-tune a large language model or build their own chatbot. While a few users should have detailed knowledge on how to build AI tools, others can benefit from mid-level knowledge on how to select tools, while still others need only basic knowledge on how to use tools. Our research suggests that a build-choose-use paradigm for AI fluency varies with a worker’s occupation, level, and role. Occupations with higher exposure to gen AI—and higher potential exposure—should have more knowledge to take advantage of the technology’s availability. Similarly, managers likely need more AI fluency than entry-level workers who may just need to know how to use tools in a finite number of situations. Of course, those whose roles involve creating AI tools to enhance current processes and potential new ones should have more skill than those in technical- or end-user roles (figure 1).
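To make the build-choose-use tiers concrete, here is a minimal, purely illustrative sketch of how an agency might encode a fluency-tier recommendation. The role attributes, tier descriptions, and assignment rules below are assumptions for illustration only; they are not definitions from the research or from figure 1.

```python
# Hypothetical sketch: mapping role attributes to a build-choose-use fluency tier.
# The attribute names and rules are illustrative assumptions, not an official framework.

from enum import Enum


class FluencyTier(Enum):
    USE = "basic: knows how to use approved AI tools"
    CHOOSE = "intermediate: knows how to select the right tool for a task"
    BUILD = "advanced: knows how to build or tailor AI tools"


def recommended_tier(builds_tools: bool, manages_others: bool, high_gen_ai_exposure: bool) -> FluencyTier:
    """Suggest a fluency tier from a worker's role attributes."""
    if builds_tools:
        return FluencyTier.BUILD      # tool creators need the deepest skills
    if manages_others or high_gen_ai_exposure:
        return FluencyTier.CHOOSE     # managers and high-exposure occupations select tools
    return FluencyTier.USE            # entry-level or low-exposure roles mainly use tools


# Example: a caseworker in a high-exposure occupation who does not build tools
print(recommended_tier(builds_tools=False, manages_others=False, high_gen_ai_exposure=True))
```

In practice, an agency would adapt the attributes and thresholds to its own workforce data; the point of the sketch is simply that the tier follows from occupation, level, and role rather than applying uniformly.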
These tiers of AI fluency can help each part of the workforce gain the right level of AI knowledge. That matters when setting expectations for what working with AI will be like: Expectations of near-magical experiences can run high, and one bad experience with AI can stall further experimentation for a whole organization. So making progress on AI depends on having the right fluency—both in terms of skills and expectations—for each individual.
Singapore’s experience illustrates the direct relationship that investing in workforce AI fluency can have on scaling AI. In December 2023, the Singapore government revised its 2019 AI strategy, introducing National AI Strategy 2.0, emphasizing reskilling and upskilling the workforce for an AI-driven future and building the necessary infrastructure to support a thriving AI industry.9 That focus on equipping the workforce with the right skills has been important in the nation’s efforts to scale AI from the bottom up. One example is the “AI Trailblazers” initiative, which aimed to develop 100 generative AI applications in 100 days, with both government entities and businesses gaining access to third-party toolkits.10
With the right access to tools and the right skills, government workers can find new opportunities for AI to help benefit the public at an unprecedented pace and scale.
Even when good AI use cases emerge, constrained budgets can stall progress toward scaling. Our research suggests that while government leaders are as eager as, or even more eager than, their commercial counterparts to invest in AI, they struggle to measure the impact AI is having on their agencies (figure 2).
For AI projects to compete in budget battles, they need a clear case built on costs and benefits. AI-fluency training can help government leaders uncover the often-opaque costs associated with choosing and operating an AI model. But on the benefit side, government leaders often struggle. Unable to cite the quantitative metrics of sales and profit that commercial companies use, government leaders may need to work harder to define and measure mission impact. In the initial phases of AI adoption, when projects focus on increasing efficiency, metrics of time and dollars saved may help. But as agencies mature in their AI journey toward more impactful use cases, they could increasingly find themselves needing clear metrics for mission performance.
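As a rough illustration of an efficiency-phase benefit estimate, the sketch below converts time saved into a dollar figure. The hour-per-day savings and rollout size echo the Australian pilot cited earlier; the workdays per year and loaded labor rate are hypothetical assumptions an agency would replace with its own figures.

```python
# Minimal sketch: translating an efficiency pilot's "time saved" into a dollar estimate
# for a budget case. Only the hours-saved and headcount figures echo the pilot described
# above; the workdays and loaded hourly rate are purely illustrative assumptions.

hours_saved_per_employee_per_day = 1.0   # pilot result cited above
employees_with_access = 7_600            # pilot rollout size cited above
workdays_per_year = 220                  # assumption
loaded_hourly_rate_usd = 60.0            # assumption: fully loaded labor cost

annual_hours_saved = hours_saved_per_employee_per_day * employees_with_access * workdays_per_year
annual_value_usd = annual_hours_saved * loaded_hourly_rate_usd

print(f"Estimated hours saved per year: {annual_hours_saved:,.0f}")
print(f"Estimated annual value: ${annual_value_usd:,.0f}")
```

Simple arithmetic like this can support early budget requests, but it captures only efficiency; it says nothing about whether mission outcomes improved, which is why the later stages of the AI journey need different metrics.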
Having a solid understanding of the ways in which gen AI can create value for an agency is imperative: It can help indicate where to look for results and which metrics to choose to measure them. The pathway to value determines the types of metrics necessary to define success and even the relative levels of governance required.
These pathways to value can help with more than selecting metrics: They can also serve as signposts in selecting levels of AI governance. The more public impact a gen AI use case has, the more validation and continuous oversight it needs. The deciding factor is not whether a gen AI use case involves direct public interaction, since some internal processes, such as benefits adjudication, can have a major impact on constituents’ lives. So, understanding whether a use case is automating for efficiency or aiming to improve the outcome of the whole workflow can—when weighed alongside whether it is back-office or mission-focused—help determine how much oversight a particular use case needs.
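A minimal sketch of that logic, with assumed tier names and rules rather than any formal governance standard, might look like this:

```python
# Illustrative sketch only: one way to express the oversight logic described above
# as a simple rule. The tier names and thresholds are assumptions, not a standard.


def governance_tier(improves_workflow_outcomes: bool, mission_focused: bool) -> str:
    """Suggest an oversight level for a gen AI use case.

    improves_workflow_outcomes: True if the use case aims to change the outcome of a
        whole workflow; False if it only automates steps for efficiency.
    mission_focused: True if the use case touches mission delivery (including internal
        processes such as benefits adjudication); False if it is back-office.
    """
    if improves_workflow_outcomes and mission_focused:
        return "high: pre-deployment validation plus continuous monitoring"
    if improves_workflow_outcomes or mission_focused:
        return "medium: periodic review and human checks on outputs"
    return "baseline: standard IT and data governance"


# A benefits-adjudication assistant is internal but mission-focused, so it rates high oversight.
print(governance_tier(improves_workflow_outcomes=True, mission_focused=True))
```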
Understanding how AI creates value can help select not only the right level of governance but also the metrics that can effectively measure success. For example, New Jersey’s state government has launched an AI implementation strategy centered on the “NJ AI Assistant” platform. Because this AI assistant touches public services, it warrants significant governance: To maintain security protocols and data sovereignty, employees operate it exclusively on state infrastructure.11 And because the AI assistant aims to improve mission effectiveness, leaders measure its success in terms of mission improvements. The New Jersey Department of Labor has seen a 35% acceleration in resident response rates through AI-enhanced communications, while the Division of Taxation’s AI-powered call center analysis has boosted successful call resolution by 50%.12
To reach the scale of adoption that can bring such significant benefits, government leaders should have the right strategies and the tools to execute them.
Ms. Alexis Bonnell, chief information officer and director of the Digital Capabilities Directorate of the Air Force Research Laboratory, Department of the Air Force16
As the Air Force Research Laboratory (AFRL) chief information officer, I’ve learned that scaling AI is not just about the technology—it’s fundamentally about serving our people. Digital transformation is human transformation. The biggest issue we encounter at AFRL is the tendency to focus on the technology, the AI model, or the tool, while neglecting its purpose—to complement our incredible team. Our discussions must begin with people, thinking about how technology can enhance their capabilities.
When the AFRL introduced NIPRGPT, an experimental gen AI research platform, the focus was not only on experimenting and learning about the technology, security, and infrastructure but also on human-machine teaming. We wanted to learn not only what people used AI for, but also what the adoption journey of a new technology looks like and how we could help people find an aspirational vision of themselves as complemented by the technology—it was AI enabling you, not AI versus you.
The broader narrative around AI often paints it as overly technical and complex, inadvertently suggesting that people are not smart or ethical enough to handle it. Our approach to AI adoption and scaling starts from a point of trust. We understand that people are capable and have navigated every previous technological wave. This trust-first approach is fundamentally different. It really boils down to saying to people, “You’re enough. You’ve got this. We trust you.”
This approach to AI isn’t just about taglines; it changes how we approach the process of adopting AI. We have delineated the AI journey into four stages: “ta-da,” “uh-oh,” “ah-ha,” and “ho-hum.”
At the ta-da stage, we make sure people have a tool and AI-101 training, which focuses on the “why AI, why now, and why you?” Then, in the uh-oh stage, we make sure they have access to experts and others like them who can discuss concerns or best approaches. A key element of helping people move quickly on this journey is role-based training. Once we get past the basics, in the ah-ha stage, we focus on what the tool can really do for them. AI is an incredible tool because of its intimacy—it will be used differently by each person. In role-based training, we tell people, “Here is what you can do with the tool.” For example, if you are in public affairs, here is what public affairs can do with the tool. Training for a legal or intelligence leader will be different from training for a manager. In each of these role-based trainings, we answer, “What’s in it for me? And how do I get an advantage?” Once people see that others just like them are using the tool, it reframes the thinking from “This tool threatens my job” to “Not using this tool threatens my promotion.” They understand that colleagues are using these tools and that, to survive and thrive, AI is a catalyst to their future opportunities—not a roadblock.
An important outcome of this approach is that when we start with people, we allow them to organize their relationship with the technology. This inverts the typical power dynamic of technology. When someone gets to use AI on their terms, they become true curators, with intimate knowledge of their role, their mission, and their information. They start to adapt their process, rearchitecting work around AI, driving change in their work better than any centralized executive could. In this way, they proactively optimize their mission, ultimately ensuring a better return on taxpayer dollars.
We are entering an era where every person is a technologist, and that is an incredible opportunity. As a CIO, my job is to figure out how to get people moving on their AI journey and embracing themselves and their human-machine teaming potential. The happiest day for me is when my users say, “What’s next?”