Delivering on the promise of AI in government

Government leaders recognize AI's potential, but scaling it requires unique strategies, workforce training, and balancing costs against public benefits

Costi Perricos

United Kingdom

Edward Van Buren

United States

Vishal Kapur

United States

Joe Mariani

United States

Government leaders around the world appear to be increasingly recognizing the transformative potential of artificial intelligence. But there is one problem: Realizing that potential means adopting AI at scale, and government agencies may struggle to move beyond small-scale pilots. Generative AI, whose at-scale solutions are still emerging, remains very much a work in progress.

Taking what does work and scaling it up should involve more than copying what companies are doing. With different incentives and risks than private industry—and, often, higher stakes for constituents—agencies may need to take a distinct path to scaling AI applications. Some government leaders are beginning to move forward in ways that are employee-driven.1

This path to scaling includes steps that agencies may find challenging. An employee-driven strategy relies on each employee having the necessary AI fluency, a level of training that will vary by job, level, and role. Even more basically, any investment in AI should weigh the costs of implementation against the benefits to the public. This innocuous statement holds two difficult truths for government leaders: one, that a lack of technical expertise can obscure the true costs of AI, making it appear either prohibitively expensive or unrealistically cheap; and two, that leaders need ways to measure the difficult-to-quantify mission outcomes of AI investments.

It’s a case that leaders can’t afford to put off making. The AI future is here; agencies should take the initiative in shaping how the technology can best help fulfill the public sector’s mission.

Key challenges

  • Limited AI expertise. In a recent Deloitte survey, only one-sixth of government leaders believed that their organization had high or very high gen AI expertise, compared with 32% to 56% across commercial industries.2 This can make navigating the technical choices around model selection, cost, and data cleaning, storage, and security difficult.
  • Inverted interest in AI. While much commercial interest in gen AI is top-down, with line-of-business executives promoting the technology, the picture is inverted for government: Workers may be eager to try gen AI while leaders are wary of risks.
  • Lack of access to gen AI. Despite some bottom-up interest in AI adoption, only 1% of government leaders surveyed said that more than 60% of the workers in their organization had access to gen AI—a figure orders of magnitude lower than among commercial peers.3

Trend in action

The scaling paradox

AI’s potential impact on the public good is monumental, even with all the usual caveats. But an agency’s tech-aided initiatives are unlikely to make that impact without reaching scale in AI adoption. This creates a fundamental paradox: At-scale adoption requires widespread use, but widespread use can also introduce risks that government agencies need to manage.

Some places are already thinking along these lines: Buenos Aires, for example. After introducing an app chatbot in 2019, the city government steadily expanded its capabilities until, by 2022, residents could use the bot to access social services, apply for a construction permit, or even report infrastructure in need of maintenance. By the end of that year, the bot had processed more than 58 million interactions, allowing Buenos Aires residents to access critical services 24/7.4

But scale can mean more than having a large number of users—it can mean becoming central to an organization’s operations. For example, the US Treasury Department has started using AI tools to locate potential fraud in government payments, preventing or recovering US$4 billion in improper payments—more than five times as much as in 2023. But the tools’ real power comes from breaking down data silos between agencies. The Treasury’s “Do Not Pay” service has expanded to integrate with state unemployment agencies and the Social Security Administration’s Death Master File, adding verification capabilities and saving money across federal payment systems.5 And even greater data integration across organizational silos is underway.6
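
To make the mechanics of such verification concrete, here is a minimal sketch of a pre-payment screening check in the spirit of Do Not Pay. The record formats, field names, and matching logic are all illustrative assumptions, not a description of the Treasury’s actual system; the point is that one check can now span data sources that previously lived in separate agencies.

    # Hypothetical sketch of pre-payment screening in the spirit of "Do Not Pay."
    # Record formats, field names, and matching logic are illustrative
    # assumptions, not the Treasury's actual implementation.
    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Payment:
        payee_id: str   # e.g., a taxpayer identification number
        amount: float
        pay_date: date

    def screen_payment(payment: Payment,
                       death_records: dict[str, date],
                       exclusion_list: set[str]) -> tuple[bool, str]:
        """Return (approved, reason) after checking shared data sources."""
        death_date = death_records.get(payment.payee_id)
        if death_date is not None and payment.pay_date > death_date:
            return False, "payee appears in death records before the pay date"
        if payment.payee_id in exclusion_list:
            return False, "payee appears on an exclusion or debarment list"
        return True, "no match in screened data sources"

    # Usage: one payment screened against two previously siloed data sources.
    deaths = {"123-45-6789": date(2024, 3, 1)}
    approved, reason = screen_payment(
        Payment("123-45-6789", 1200.0, date(2024, 6, 15)), deaths, set())
    print(approved, "-", reason)  # False - payee appears in death records ...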

These examples underscore not only AI’s potential benefits but also the fact that, as agencies worldwide move to adopt applications, more government employees will need to be familiar with the technology and its expanding uses. Social service workers in Buenos Aires should understand the limitations of the AI chatbot so they can be sure it gives users valid information; employees in state unemployment offices should have sufficient AI fluency to use the Treasury’s Do Not Pay service so that it stops fraud without delaying legitimate payments to workers in need. To get to scale, agencies should train employees both to work with AI and to use applications to make their daily work more efficient and accurate.

But government work is sensitive, and scaling AI should include managing risks of deploying AI widely in public arenas. Giving workers broad access to applications—especially those based in gen AI—can quickly uncover new areas in which AI could help agencies fulfill their mission, but it can dramatically increase risks. New AI solutions might fail to work as intended; far worse, they could generate inaccurate information and direct employees or constituents to make decisions based on that information. The question for agencies is how to get widespread, bottom-up adoption while controlling for risks.

Our research on AI adoption in government suggests strategies that can help with this paradox: Giving the right access to the right workers with the right skills, and measuring mission outcomes to stay on track.

The right access for the right workers with the right skills

Unlike commercial enterprises, in which leadership often drives AI scaling, in government it is line employees who may be poised to unearth transformative AI use cases. But to find and develop those use cases, workers need access to the appropriate tools and a level of training suited to their roles and occupations. Access to gen AI continues to be an issue, with only 1% of government respondents in a recent Deloitte survey reporting that more than 60% of their workforce had access to gen AI tools.7 This is beginning to change, albeit slowly, as more governments give their workers wider access to gen AI tools.

Over a six-month period, the Digital Transformation Agency of Australia deployed an AI assistant to more than 7,600 employees across some 60 government agencies to assess gen AI’s impact on employee efficiency and productivity. The initiative aimed to understand the efficiencies that the tool could bring to daily work—and identify opportunities for more tailored solutions. In the initial pilot, the tool saved each employee an average of one hour per day on administrative tasks.8

While government access to tools continues to grow slowly, leaders can move now to address the AI-fluency issue. Not every worker needs to know how to fine-tune a large language model or build their own chatbot. A few users should have detailed knowledge of how to build AI tools; others can benefit from mid-level knowledge of how to select tools; still others need only basic knowledge of how to use them. Our research suggests a build-choose-use paradigm in which the AI fluency a worker needs varies with occupation, level, and role. Occupations with higher exposure to gen AI—and higher potential exposure—warrant more knowledge to take advantage of the technology’s availability. Similarly, managers likely need more AI fluency than entry-level workers, who may just need to know how to use tools in a finite number of situations. And those whose roles involve creating AI tools to enhance current processes, and potentially new ones, need more skill than those in technical or end-user roles (figure 1).
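
As a rough illustration of this paradigm, the sketch below encodes build-choose-use tiers as a simple lookup over a worker’s gen AI exposure, job level, and role. The tier definitions and mapping rules are assumptions made for illustration, not an official rubric.

    # Illustrative encoding of the build-choose-use fluency tiers described
    # above. The tier definitions and mapping rules are assumptions made for
    # illustration, not an official rubric.
    from enum import Enum

    class Fluency(Enum):
        USE = "basic knowledge: how to use AI tools"
        CHOOSE = "mid-level knowledge: how to select the right tool"
        BUILD = "detailed knowledge: how to build and tune AI tools"

    def recommended_fluency(exposure: str, level: str, role: str) -> Fluency:
        """Map a worker's gen AI exposure, job level, and role to a tier."""
        if role == "creator":                 # builds AI tools for others
            return Fluency.BUILD
        if level == "manager" or exposure == "high":
            return Fluency.CHOOSE             # selects tools for a workflow
        return Fluency.USE                    # applies tools in set situations

    # Usage: an entry-level, low-exposure end user needs only "use" skills;
    # a manager in a high-exposure occupation should be able to choose tools.
    print(recommended_fluency("low", "entry", "end user").value)
    print(recommended_fluency("high", "manager", "end user").value)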

These tiers of AI fluency can help each segment of the workforce gain the right level of AI knowledge. This matters when setting expectations for what working with AI will be like: Expectations can run high for near-magical experiences, and one bad experience with AI can stall further experimentation for a whole organization. Making progress on AI thus depends on having the right fluency—both in terms of skills and expectations—for each individual.

Singapore’s experience illustrates the direct relationship between investing in workforce AI fluency and scaling AI. In December 2023, the Singapore government revised its 2019 AI strategy, introducing National AI Strategy 2.0, which emphasizes reskilling and upskilling the workforce for an AI-driven future and building the infrastructure needed to support a thriving AI industry.9 That focus on equipping the workforce with the right skills has been important to the nation’s efforts to scale AI from the bottom up. One example is the “AI Trailblazers” initiative, which aimed to develop 100 generative AI applications in 100 days, with both government entities and businesses gaining access to third-party toolkits.10

With the right access to tools and the right skills, government workers can find new opportunities for AI to help benefit the public at an unprecedented pace and scale.

Measuring mission outcomes

Even when good AI use cases emerge, constrained budgets can stall progress toward scaling. Our research suggests that while government leaders are as eager as, or even more eager than, their commercial counterparts to invest in AI, they struggle to measure the impact AI is having on their agencies (figure 2).

For AI projects to win budget battles, they need a clear accounting of costs and benefits. AI-fluency training can help government leaders uncover the often-opaque costs associated with choosing and operating an AI model. But on the benefit side, government leaders often struggle. Unable to cite the quantitative metrics of sales and profit that commercial companies use, government leaders may need to work harder to define and measure mission impact. In the initial phases of AI adoption, when projects focus on increasing efficiency, metrics of time and dollars saved may suffice. But as agencies mature in their AI journey toward more impactful use cases, they will increasingly need clear metrics for mission performance.
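
As a concrete illustration of the efficiency-stage math, the sketch below weighs the dollar value of hours saved against annual program cost. All figures and the cost structure are invented placeholders rather than benchmarks.

    # Back-of-the-envelope cost-benefit sketch for an efficiency-stage AI
    # project. All figures are invented placeholders; real costs (licensing,
    # hosting, data cleaning, security reviews) vary widely by model and agency.
    def annual_net_benefit(num_users: int,
                           hours_saved_per_user_per_day: float,
                           loaded_hourly_rate: float,
                           annual_program_cost: float,
                           workdays_per_year: int = 230) -> float:
        """Dollar value of staff time saved per year, minus program cost."""
        hours_saved = num_users * hours_saved_per_user_per_day * workdays_per_year
        return hours_saved * loaded_hourly_rate - annual_program_cost

    # Usage: 500 users saving 30 minutes a day at a $60/hour loaded rate,
    # against $1.5 million a year in licensing, infrastructure, and support.
    print(f"${annual_net_benefit(500, 0.5, 60.0, 1_500_000):,.0f}")  # $1,950,000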

Having a solid understanding of the ways in which gen AI can create value for an agency is imperative: It can help indicate where to look for results and which metrics to choose to measure them. The pathway to value determines the types of metrics necessary to define success and even the relative levels of governance required.

  • AI automating work. When AI automates single tasks, it can create value by making the organization more efficient. Metrics such as time and dollars saved are the appropriate yardsticks of success.
  • AI augmenting work. When AI is incorporated into a larger workflow, it can create value by improving that workflow’s overall performance. In the commercial world, it can be easy to measure these performance benefits with metrics such as increased sales or decreased cash-conversion time; for government, it usually means finding metrics that get at mission outcomes such as decreasing crime, making benefits delivery more efficient, or increasing community longevity.

These pathways to value can help with more than selecting metrics: They can also serve as signposts in selecting levels of AI governance. The more public impact a gen AI use case has, the more validation and continuous oversight it needs. The question is not simply whether a use case involves direct public interaction, since some internal processes, such as benefits adjudication, can have a major impact on constituents’ lives. So, understanding whether a use case is automating for efficiency or aiming to improve the outcome of a whole workflow, weighed alongside whether it is back-office or mission-focused, can help determine how much oversight that use case needs.
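
One minimal way to express that logic is a small decision table combining the value pathway with whether the use case is mission-facing. The governance tiers and rules in this sketch are illustrative assumptions, not a prescribed standard.

    # Illustrative decision table for AI governance levels, combining the value
    # pathway (automating vs. augmenting) with mission impact. The tier names
    # and rules are assumptions for illustration, not a prescribed standard.
    def governance_level(pathway: str, mission_facing: bool) -> str:
        """pathway: 'automate' (single-task efficiency) or 'augment' (whole workflow)."""
        if pathway == "augment" and mission_facing:
            return "high: pre-deployment validation plus continuous oversight"
        if pathway == "augment" or mission_facing:
            return "medium: periodic review of outputs and outcomes"
        return "baseline: standard IT controls and spot checks"

    # Usage: benefits adjudication is internal but mission-facing, so it still
    # warrants heavy oversight; an internal memo summarizer does not.
    print(governance_level("augment", mission_facing=True))
    print(governance_level("automate", mission_facing=False))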

Understanding how AI creates value can help agencies select not only the right level of governance but also the metrics that can effectively measure success. For example, New Jersey’s state government has launched an AI implementation strategy centered on the “NJ AI Assistant” platform. Because this AI assistant touches public services, it warrants significant governance: To maintain security protocols and data sovereignty, employees operate it exclusively on state infrastructure.11 And because the assistant aims to improve mission effectiveness, leaders measure its success in terms of mission improvements. The New Jersey Department of Labor has seen a 35% acceleration in resident response rates through AI-enhanced communications, while the Division of Taxation’s AI-powered call center analysis has boosted successful call resolution by 50%.12

Tools and strategies for delivering on AI

To reach the scale of adoption that can bring such significant benefits, government leaders need the right strategies and the tools to execute them.

Strategy: Build platforms to rapidly scale benefit

  • Tools: AI platforms and marketplaces. A bottom-up path to scaling implies not only giving workers wide access to AI tools but also being able to identify winning use cases and get them into the hands of other workers. For some agencies, this could mean an AI platform that pairs the technical capabilities needed to support widespread use of the technology with the organizational capabilities needed to screen and vet new AI solutions. The US Department of Defense’s Joint Common Foundation is one example of a large-scale AI platform, but the State Department has also adopted a smaller-scale AI marketplace to help develop and quickly scale new AI use cases.13

Strategy: Build expertise to make informed choices

  • Tools: Existing AI certificates from academia and government. Getting the right training in the hands of the right workers does not necessarily require a huge lift by learning and development staff. The Federal AI Institute and others in academia and government offer certificate-based programs that can quickly get workers to the right level of AI fluency.14

Strategy: Build with partners to manage risk

  • Tool: The expertise of technical partners. AI is a rapidly changing space, and trying to navigate it alone can be challenging. Tapping into the expertise of technical partners can be critical for agencies to lay technical foundations and, ironically, be an important step toward avoiding vendor lock-in. This can be a two-way street. Not only can technical partners help government leaders gain the fluency they need to make appropriate AI decisions, but government can also shape how the tech industry approaches problems, creating demand for solutions such as sovereign AI that can address security and privacy problems for government.15
  • Tool: The expertise of other government leaders. Government leaders face unique incentives, so where better to find advice that fits those incentives than from other government leaders? The expertise and advice shared in forums such as Chief AI Officer or CIO Councils can be instrumental in helping leaders navigate government’s unique hurdles such as budget and election cycles.

My take

Scaling AI by empowering people

Ms. Alexis Bonnell, chief information officer and director of the Digital Capabilities Directorate of the Air Force Research Laboratory, Department of the Air Force16

As the Air Force Research Laboratory (AFRL) chief information officer, I’ve learned that scaling AI is not just about the technology—it’s fundamentally about serving our people. Digital transformation is human transformation. The biggest issue we encounter at AFRL is the tendency to focus on the technology, the AI model, or the tool, while neglecting its purpose—to complement our incredible team. Our discussions must begin with people, thinking about how technology can enhance their capabilities.

When the AFRL introduced NIPRGPT, an experimental gen AI research platform, the focus was not only on experimenting with and learning about the technology, security, and infrastructure but also on human-machine teaming: learning what people used AI for, what the adoption journey of a new technology looks like, and how we could help people find an aspirational vision of themselves as complemented by the technology—it was AI enabling you, not AI versus you.

The broader narrative around AI often paints it as overly technical and complex, inadvertently suggesting that people are not smart or ethical enough to handle it. Our approach to AI adoption and scaling starts from a point of trust. We understand that people are capable and have navigated every previous technological wave. This trust-first approach is fundamentally different. It really boils down to saying to people, “You’re enough. You’ve got this. We trust you.”

This approach to AI isn’t just about taglines; it changes how we approach the process of adopting AI. We have delineated the AI journey into four stages: “ta-da,” “uh-oh,” “ah-ha,” and “ho-hum.”

  1. The ta-da is about people experiencing gen AI for themselves. It requires access and the opportunity to experiment with new tools, encouraging exploration. NIPRGPT wasn’t about crafting the ultimate tool; it was about providing a safe, secure space for discovery. Observing how people engaged with AI allowed AFRL to share with commercial and government teams how to design better initial AI experiences and drive AI literacy.
  2. As people experience the ta-da moment, they move to the uh-oh—pondering AI’s relevance to their roles. They ask, “Where does this fit in my life? How is this related to my role? How do I use this well?” These questions are anchored in the reality of the technology. This phase is critical for people to work through fears and anxieties. If the uh-oh comes before the ta-da, all people can do is repeat other people’s fears and concerns. When it comes after the ta-da, they can actually explore their concerns by using the tool, which means they ask better questions that are more grounded in the reality of the technology and its role in their mission.
  3. As these queries evolve, they culminate in the ah-ha moment, where users perceive tangible benefits in their daily work, such as pain reduction in their everyday tasks. When they find that thing that they can do faster, or more easily, and get minutes on the mission back, the tool finally has a place in their lives and work.
  4. Ultimately, the goal is to reach the ho-hum stage, where the user experience becomes routine and there is comfort in the technology. As the CIO, my role is to help workers transition from being potentially intimidated by a technology to being comfortable and even bored by it as rapidly as possible. Because as soon as they reach being bored, they’re ready for the next thing. In national security, in a tech-enabled age, we can forget that having the best tech doesn’t matter if people don’t use it or aren’t comfortable with it. We capture and maintain strategic advantage when our people are able to adapt, demand, and leverage the best technology. The faster and more successfully we can drive adoption, the faster we can outpace the adversary.

At the ta-da stage, we make sure people have a tool and AI-101 training, which focuses on “why AI, why now, and why you?” Then, in the uh-oh stage, we make sure they have access to experts and others like them who can discuss concerns or best approaches. A key element of helping people move quickly on this journey is role-based training. Once we get past the basics, in the ah-ha stage, we focus on what the tool can really do for them. AI is an incredible tool because of its intimacy—it will be used differently by each person. In role-based training, we tell people, “Here is what you can do with the tool.” For example, if you are in public affairs, here is what public affairs can do with the tool. Training for a legal intelligence leader will be different from training for a manager. In each of these role-based trainings, we answer, “What’s in it for me? And how do I get an advantage?” Once people see that others just like them are using the tool, it reframes the thinking from “This tool threatens my job” to “Not using this tool threatens my promotion.” They understand that colleagues are using these tools and that, to survive and thrive, AI is a catalyst to their future opportunities—not a roadblock.

An important outcome of this approach is that when we start with people, we allow them to organize their relationship with the technology. This inverts the typical power dynamic of technology. When someone gets to use AI on their terms, they become true curators, with intimate knowledge of their role, their mission, and their information. They start to adapt their process, rearchitecting work around AI, driving change in their work better than any centralized executive could. In this way, they proactively optimize their mission, ultimately ensuring a better return on taxpayer dollars.

We are entering an era where every person is a technologist, and that is an incredible opportunity. As a CIO, my job is to figure out how to get people moving on their AI journey and embracing themselves and their human-machine teaming potential. The happiest day for me is when my users say, “What’s next?”



Endnotes

  1. Joe Mariani, Pankaj Kishnani, and Ahmed Alibage, “Government’s less trodden path to scaling generative AI,” Deloitte, Oct. 24, 2024.
  2. Jim Rowan et al., “Now decides next: Moving from potential to performance—Deloitte’s State of Generative AI in the Enterprise: Quarter three report,” Deloitte, August 2024.
  3. Mariani et al., “Government’s less trodden path to scaling generative AI.”
  4. Buenos Aires Administration, “Boti reached 58 million conversations and continues to add services,” Jan. 31, 2023.
  5. Natalie Alms, “AI tools helped Treasury recover billions in fraud and improper payments,” Nextgov/FCW, Oct. 18, 2024.
  6. The White House, “Fact sheet: President Donald J. Trump protects America’s bank account against waste, fraud, and abuse,” March 25, 2025.
  7. Mariani et al., “Government’s less trodden path to scaling generative AI.”
  8. Australian Government Digital Transformation Agency, “Evaluation of whole-of-government trial into generative AI: Now available,” Oct. 23, 2024.
  9. Singapore Economic Development Board, “Singapore updates AI strategy with aim to contribute globally valuable breakthroughs,” Dec. 14, 2023.
  10. Smart Nation Singapore, “Launch of the AI Trailblazers initiative,” July 24, 2023.
  11. New Jersey Office of Innovation, “NJ AI Assistant,” April 1, 2025.
  12. State of New Jersey, “Governor Murphy unveils AI tool for state employees and training course for responsible use,” press release, July 3, 2024.
  13. Alexandra Kelley, “State to develop new AI marketplace for staff,” Nextgov/FCW, Nov. 1, 2024.
  14. Rebecca Heilweil, “The federal government wants to teach workers about AI prompt engineering,” FedScoop, Nov. 1, 2024.
  15. Muath Alduhishy, “Sovereign AI: What it is, and 6 strategic pillars for achieving it,” World Economic Forum, April 25, 2024.
  16. The executive’s participation in this article is solely for educational purposes based on their knowledge of the subject, and the views expressed by them are solely their own. This article should not be deemed or construed to be for the purpose of soliciting business for any of the companies mentioned, nor does Deloitte advocate or endorse the services or products provided by these companies.

Acknowledgments

The authors would like to thank Sushumna Aggarwal, William D. Eggers, and Thirumalai Kannan for their research contributions and support with project management. In addition, the authors would like to thank Alexis Bonnell for her valuable input in the My Take section.

Cover image by: Sofia Sergi; Getty Images
