
AI transparency and reliability in finance and accounting: understanding how AI can impact financial audits

By Ryan Hittner, Audit & Assurance Principal, Deloitte & Touche LLP; John Fogarty, Audit & Assurance Senior Manager, Deloitte & Touche LLP; and Morgan Dove, Audit & Assurance Senior Manager, Deloitte & Touche LLP

Talking points
  • Like other financial information, AI-generated outputs must meet stakeholder expectations for accuracy, reliability, and trustworthiness.
  • Achieving this goal involves understanding how AI systems can impact financial audits and implementing effective AI governance and controls.
  • Important elements include human oversight, a reliable audit trail, and thorough testing and ongoing monitoring.

Artificial intelligence (AI) is transforming finance and accounting at an astonishing rate. Over the past year and a half, The Pulse has chronicled how finance organizations are rapidly leveraging both traditional and generative AI (GenAI) to enhance a wide range of finance applications, from SOX compliance to controllership functions like financial close and reporting.

One element connecting all these applications is the financial audit process. When implementing AI for finance and accounting functions, it’s important to consider how it may impact both internal and external audits and to adapt your AI governance and oversight procedures accordingly. We recently explored this topic in our presentation “AI in Accounting and Financial Reporting: How Might it Impact Your Audit?” at the Financial Executives International (FEI) AI conference. Here are some of the takeaways and insights from that session.

The crucial role of AI controls

We’ve all heard about AI’s risks. For finance and accounting functions, these risks typically center on the accuracy of AI-generated outputs and the transparency (organizational awareness) of AI-enabled processes. The global finance system relies on accurate and reliable information from finance and accounting. These teams must have confidence in the data they receive from AI applications and must also demonstrate why their trust is justified—no easy feat. However, there are practical steps that organizations can take to maintain trust and accuracy in their financial reporting processes.

Here are some governance procedures and controls that can help organizations support the accuracy and reliability of AI-generated financial data:

  • Human oversight and transparency: Maintaining human review in the AI lifecycle where possible and being transparent with stakeholders about where and how AI is being used are two pillars of reliability and trustworthiness.
  • Data management and audit trail maintenance: Implementing data quality controls is vital for ensuring that the data used by AI models is accurate, relevant, and reliable. Archiving both inputs and outputs and managing any changes to the AI models or their data sets can help maintain a thorough audit trail.
  • Testing and ongoing monitoring: Implementing robust development, validation, and ongoing monitoring demonstrates to internal and external stakeholders that management is monitoring and addressing risks associated with the use of AI models.
  • Documentation and reporting: These procedures not only enable transparency and traceability; they provide a clear record of decisions made around AI models.

Let’s dive a little deeper into each of these areas.

Strategic and efficient human oversight

Human review is essential for AI systems, and thoroughly evaluating AI outputs should be standard practice. However, if an AI system consistently proves reliable in initial testing and over time, it may be possible to balance hands-on human oversight with smart, automated monitoring. This can involve setting usage limits and automated checks to flag unusual activity and alert your team when needed. Periodically sampling AI actions can help ensure reliability without overwhelming your team. This approach can maximize AI’s speed and efficiency while maintaining high accuracy and quality.
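
As a rough illustration of this sampling-plus-thresholds approach, the routing logic might look like the following Python sketch; the sample rate, confidence floor, and field names are assumptions for the example, not a prescribed design:

```python
import random

# Illustrative sketch: escalate low-confidence AI outputs to a human reviewer
# (automated check) and randomly sample a small share of routine outputs
# (periodic sampling). SAMPLE_RATE, CONFIDENCE_FLOOR, and the output fields
# are hypothetical values chosen for the example.

SAMPLE_RATE = 0.05        # review roughly 5% of routine outputs
CONFIDENCE_FLOOR = 0.80   # always escalate outputs below this confidence

def needs_human_review(output: dict, rng: random.Random) -> bool:
    """Return True if this AI output should be routed to a human reviewer."""
    if output["confidence"] < CONFIDENCE_FLOOR:
        return True                      # automated check: unusual/uncertain output
    return rng.random() < SAMPLE_RATE    # periodic sampling of normal activity

rng = random.Random(42)
outputs = [{"label": "utilities", "confidence": 0.95},
           {"label": "travel", "confidence": 0.62}]
flags = [needs_human_review(o, rng) for o in outputs]
```

Here the second output is flagged because its confidence falls below the floor, while the first is only subject to random sampling.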

Data quality practices

The performance and reliability of an AI model hinge on the quality and accuracy of the data you feed into it. Effective AI data management practices are essential to verify data quality and maintain an audit trail for AI transparency. Leading practices include implementing data quality controls, archiving model inputs and outputs, and applying version and access controls to data sets and models.
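
A minimal sketch of the archiving practice described above, assuming Python; the record schema (field names such as `model_version` and `content_hash`) is our own illustration, not a prescribed standard:

```python
import datetime
import hashlib
import json

# Hypothetical audit-trail record: archive each model input and output with a
# content hash, timestamp, and model version so a later reviewer can trace
# exactly what the model received and produced, and detect any alteration.

def audit_record(model_version: str, model_input: dict, model_output: dict) -> dict:
    payload = json.dumps({"in": model_input, "out": model_output}, sort_keys=True)
    return {
        "model_version": model_version,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "content_hash": hashlib.sha256(payload.encode()).hexdigest(),
        "input": model_input,
        "output": model_output,
    }

record = audit_record("invoice-clf-1.2",
                      {"invoice_text": "Acme Utilities, $120.00"},
                      {"category": "utilities"})
```

Recomputing the hash over the archived input and output later and comparing it with `content_hash` offers a simple integrity check for the audit trail.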

Gauging performance

Robust testing and monitoring are important for finance and accounting functions. Continuous testing and monitoring of AI systems can help ensure AI outputs meet prescribed accuracy, reliability, and compliance standards. Testing metrics and methods vary by use case and model type, but comparing outputs against expected results can help identify inaccuracies. Organizations working with AI vendors should understand and potentially augment the vendor's testing and monitoring processes.
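
One way to implement the output-versus-expected comparison as ongoing monitoring is a rolling accuracy check, sketched below in Python; the window size and alert threshold are illustrative assumptions a team would set for its own use case:

```python
from collections import deque

# Illustrative ongoing-monitoring sketch: compare recent AI outputs against
# expected (human-verified) results in a rolling window and signal when
# accuracy drifts below a management-set threshold.

WINDOW = 100                # number of recent results to track
ACCURACY_THRESHOLD = 0.95   # hypothetical acceptable accuracy floor

class AccuracyMonitor:
    def __init__(self):
        self.results = deque(maxlen=WINDOW)

    def record(self, predicted: str, expected: str) -> None:
        self.results.append(predicted == expected)

    def accuracy(self) -> float:
        return sum(self.results) / len(self.results) if self.results else 1.0

    def breached(self) -> bool:
        return self.accuracy() < ACCURACY_THRESHOLD

mon = AccuracyMonitor()
for pred, exp in [("utilities", "utilities"), ("travel", "office supplies")]:
    mon.record(pred, exp)
```

In practice the breach signal would feed an alerting workflow so the team is notified when the model's observed accuracy degrades.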

Case in point

Let’s look at how governance and controls work in practice to achieve accurate AI outputs. In this example, a tech company’s finance team uses a Computer Vision model and a Large Language Model (LLM) to automate its invoice classification process and evaluate process performance afterwards. The AI process includes these steps:

  • The Computer Vision model first extracts features and text from the invoices. The LLM then classifies the extracted text into predefined categories (e.g., utilities, office supplies, travel expenses).
  • The company evaluates the performance of its invoice classification process by testing both key components: feature extraction and classification.
  • The finance team first evaluates the feature extraction component of the invoice classification process using a large testing dataset, calculating metrics like character error rate (CER), word error rate (WER), and Levenshtein distance, which measures the minimum number of single-character edits required to change one string into another.
  • The finance team also explores other novel methods for evaluating feature extraction, like semantic similarity score (SemScore), which can help the team evaluate how similar or dissimilar the selected invoice is to other similarly categorized invoices.
  • Finally, the company evaluates the performance of the LLM in classifying the invoices correctly, using metrics such as accuracy, precision, recall, and F1 score, which is the harmonic mean of precision and recall.
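
The metrics named in the steps above can be computed in plain Python. The following is an illustrative sketch with toy strings and labels of our own choosing, not the company's actual test harness:

```python
# Levenshtein distance and CER for the feature extraction step; precision,
# recall, and F1 for the classification step. All inputs are toy examples.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

def precision_recall_f1(predicted, actual, positive):
    """Per-class precision, recall, and F1 (harmonic mean of the two)."""
    tp = sum(p == positive and a == positive for p, a in zip(predicted, actual))
    fp = sum(p == positive and a != positive for p, a in zip(predicted, actual))
    fn = sum(p != positive and a == positive for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

d = levenshtein("kitten", "sitting")   # 3 edits
rate = cer("invoice", "inv0ice")       # 1 substitution over 7 characters
p, r, f1 = precision_recall_f1(
    ["utilities", "travel", "utilities", "utilities"],
    ["utilities", "utilities", "utilities", "travel"],
    positive="utilities")
```

WER follows the same edit-distance logic applied to word tokens rather than characters, and SemScore-style checks would instead compare embedding vectors of the extracted and reference text.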

This case study shows how testing methods and metrics can be selected to comprehensively assess a model’s performance. The resulting test evidence can demonstrate to risk managers, regulators, auditors, and other stakeholders how management became comfortable with the model’s performance for the particular use case and what error rate it deemed acceptable.

Along with rigorous testing and ongoing monitoring of the model’s performance, the company’s finance team has put several other important controls in place. They’re keeping detailed records of all their training, testing, and unlabeled data sets, and have also implemented data versioning and access controls to increase data integrity. Their models are fully documented, covering everything from key decisions and how the model works to known limitations, controls, and any updates made along the way. Finally, they’re transparent with stakeholders about exactly where, how, and when AI is used and, just as importantly, where human oversight still plays a role in the process.

What role can Deloitte play?

The importance of effective AI governance, oversight, and controls can’t be overstated. They are the foundation of transparency, accuracy, and reliability for AI finance systems and data. Deloitte can advise you on how to harness AI responsibly using safeguards like Deloitte’s Trustworthy AI™ framework. For more information, watch this video or visit our website.

The services described herein are illustrative in nature and are intended to demonstrate our experience and capabilities in these areas; however, due to independence restrictions that may apply to audit clients (including affiliates) of Deloitte & Touche LLP, we may be unable to provide certain services based on individual facts and circumstances.

This publication contains general information only and Deloitte is not, by means of this publication, rendering accounting, business, financial, investment, legal, tax, or other professional advice or services. This publication is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor. Deloitte shall not be responsible for any loss sustained by any person who relies on this publication.

About Deloitte

Deloitte refers to one or more of Deloitte Touche Tohmatsu Limited, a UK private company limited by guarantee (“DTTL”), its network of member firms, and their related entities. DTTL and each of its member firms are legally separate and independent entities. DTTL (also referred to as “Deloitte Global”) does not provide services to clients. In the United States, Deloitte refers to one or more of the US member firms of DTTL, their related entities that operate using the “Deloitte” name in the United States and their respective affiliates. Certain services may not be available to attest clients under the rules and regulations of public accounting. Please see www.deloitte.com/about to learn more about our global network of member firms.

Copyright © 2025 Deloitte Development LLC. All rights reserved.

Ryan Hittner

United States
Audit & Assurance Principal

Ryan is an Audit & Assurance principal with more than 15 years of management consulting experience, specializing in strategic advisory to global financial institutions focusing on banking and capital markets. Ryan co-leads Deloitte’s Artificial Intelligence & Algorithmic practice, which is dedicated to advising clients in developing and deploying responsible AI, including risk frameworks, governance, and controls related to Artificial Intelligence (“AI”) and advanced algorithms.

Ryan also serves as deputy leader of Deloitte’s Valuation & Analytics practice, a global network of seasoned industry professionals with experience encompassing a wide range of traded financial instruments, data analytics, and modeling. In this role, Ryan leads Deloitte’s Omnia DNAV Derivatives technologies, which incorporate automation, machine learning, and large datasets.

Ryan previously served as a leader in Deloitte’s Model Risk Management (“MRM”) practice and has extensive experience providing a wide range of model risk management services to financial services institutions, including model development, model validation, technology, and quantitative risk management. He specializes in quantitative advisory across asset classes and risk domains such as AI and algorithmic risk, model risk management, liquidity risk, interest rate risk, market risk, and credit risk. He serves his clients as a trusted service provider to the CEO, CFO, and CRO in solving problems related to risk management and financial risk management issues. Additionally, Ryan has worked with several of the top 10 US financial institutions, leading quantitative teams that address complex risk management programs, typically involving process reengineering.

Ryan also leads Deloitte’s initiatives focusing on ModelOps and cloud-based solutions, driving automation and efficiency within the model/algorithm lifecycle. Ryan received a BA in Computer Science and a BA in Mathematics & Economics from Lafayette College.
Media highlights and perspectives

  • First Bias Audit Law Starts to Set Stage for Trustworthy AI, August 11, 2023 – In this article, Ryan was interviewed by the Wall Street Journal, Risk and Compliance Journal about the New York City Law 144-21 that went into effect on July 5, 2023.
  • Perspective on New York City local law 144-21 and preparation for bias audits, June 2023 – In this article, Ryan and other contributors share the new rules that are coming for use of AI and other algorithms for hiring and other employment decisions in New York City.
  • Road to Next, June 13, 2023 – In the June edition, Ryan sat down with Pitchbook to discuss the current state of AI in business and the factors shaping the next wave of workforce innovation.
