Scaling for success – enterprise adoption strategies for Gen-AI

Embracing the Generative AI Revolution in Financial Services: Recommendations for Future Success (Part 3 of 5)

Generative AI (Gen-AI) has the potential to revolutionise industries, including financial services, but a key step towards realising its full potential is the ability of institutions to scale the right infrastructure to ensure Gen-AI can be adopted across the enterprise where needed. This article – the third in a series of five – explores the strategic roadmap for scaling Gen-AI, addressing issues from picking the right cloud strategy and managing unstructured data to empowering business users and balancing cost with performance, all informed by insights from the deployment of Deloitte’s own proprietary Gen-AI solution, PairD1.

Fuelling AI innovation – cloud is the catalyst


Cloud computing has emerged as a critical enabler of Gen-AI, providing the scalability, flexibility, and cost-effectiveness required to power enterprise adoption. Its role in accelerating Gen-AI initiatives is central, particularly when considering the need for access to cutting-edge AI tools and infrastructure. 

In this context, cloud has a number of critical advantages over “on-prem” alternatives. These include:

  • Scalability: cloud services allow organisations to scale their AI infrastructure dynamically, adapting to changing demands without the need for heavy upfront investment.
  • Access to tools: cloud platforms offer integrated access to advanced AI tools and coding libraries, simplifying and accelerating the development and deployment of Gen-AI models.
  • Cost-effectiveness: crucially, cloud-based infrastructure also optimises resource use, minimising costs related to unused capacity in production. Organisations should consider hybrid strategies for experimentation, however, especially where dedicated on-prem GPU stacks are used.

For many institutions the choice is clear – by leveraging the cloud as a strategic partner, they can gain access to flexible and scalable infrastructure that aligns with their Gen-AI roadmaps, using their providers’ flexible resources to help manage increasing computational demand efficiently and effectively.

Bridging the gap with on-premise data for Gen-AI


However, for some organisations, full cloud adoption is not always possible due to security and privacy concerns. In such circumstances, on-premise solutions, particularly those involving large language models (LLMs), can provide a viable alternative for enterprises needing to maintain control over sensitive data.

For such institutions, a balance needs to be struck between the data needed for AI that can be held in the cloud and that requiring on-premise warehousing. Naturally, this decision needs to be carefully thought through in light of all the available options. For example, organisations can look to on-premise LLM solutions which allow them to reap the benefits of Gen-AI while simultaneously ensuring data sovereignty and addressing security concerns. Furthermore, on-premise AI solutions can help to mitigate risks related to data egress, offering greater control over where and how data is processed. The critical challenge, however, is to ensure that institutions can leverage Gen-AI securely while remaining fully compliant with data privacy regulations.

The question of how best to manage data was a key consideration in PairD’s deployment. Deloitte ensured our tool complied with data privacy laws, such as GDPR, while managing vast amounts of multi-modal data. To achieve the best outcome, we implemented comprehensive data strategies to ensure availability, compliance, and ethical considerations. This experience underlines the importance of focusing on data quality, residency, and compliance for all organisations as they scale their Gen-AI programmes, collaborating where needed across legal and security teams and ensuring the right data management strategies are in place to support their applications.

Solid foundations through agile and responsible adoption


A solid infrastructure foundation is critical for the agile and responsible deployment of Gen-AI. This foundation must address various factors, from data governance and security to continuous monitoring and cost control. Key elements include:

  • Solid data foundations: ensure the organisation has access to high-quality, relevant, and well-governed data with which to train its large language models.
  • Security first: implement security measures that protect sensitive data throughout AI development and deployment.
  • Risk mitigation: identify and mitigate potential risks early in the implementation process, ensuring AI applications are safe and reliable when launched.
  • Continuous monitoring: monitor AI systems in real-time to track performance, fairness and compliance.
  • User feedback: establish feedback loops to gather input from users and continuously improve AI solutions.
  • Cost control: implement strategies to manage the costs associated with AI, including cloud resources, model training, and data storage.

Building a comprehensive infrastructure foundation will help to enable agile, responsible, cost-effective Gen-AI across the enterprise.

Facing the tricky challenges of availability, compliance and ethics in the age of Gen-AI


Gen-AI needs data, and in today’s digitally abundant world that data is increasingly unstructured. Voice, video and image data, alongside traditional structured data, represent vast untapped potential that organisations must harness in order to unlock the full power of Gen-AI. Key considerations for managing such multi-modal data include:

  • Data capture: institutions require strategies to ensure the efficient capture and storage of multi-modal data across various sources.
  • Data cleansing: they must also ensure unstructured data is properly cleaned and organised to avoid so-called ‘garbage in, garbage out’ issues.
  • Data organisation: in addition, they will need to build the infrastructure necessary to handle the complexity and variety of multi-modal data required to fuel next-gen AI applications.

Hence, organisations embarking on their Gen-AI journeys must also invest in robust data infrastructure to help them manage their unstructured data holdings, ensuring they are high-quality, relevant and fully compliant with all pertinent regulations, including GDPR. For some organisations this alone will be no small feat. Against this potentially challenging backdrop, then, there are several data management issues for institutions to consider:

  • High-quality data: ensure that data is accurate, complete and free from bias to avoid skewed Gen-AI inferences. Data must also be relevant to the problem the solution aims to solve and organised in a way that facilitates efficient processing and analysis by the system’s algorithms.
  • Regulatory compliance: adhere to all relevant data privacy laws, ensuring that data residency, minimisation and security are principles built into every Gen-AI project.
  • Collaboration: foster early cooperation between data owners, legal teams and AI developers inside the organisation to address potential compliance and data availability issues early. Identify the location, sensitivity and regulatory requirements for all relevant datasets, with clear and secure processes for accessing and using data for Gen-AI purposes. Firms should also conduct thorough assessments to identify and mitigate potential privacy risks and regularly review data handling practices, adjusting as needed to maintain compliance with evolving regulations.
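These data quality principles can be made concrete with simple, automated checks applied before data reaches a Gen-AI pipeline. The sketch below is illustrative only: the record structure and field names are hypothetical assumptions, not a prescribed schema.

```python
# Illustrative only: minimal pre-ingestion quality checks for records
# destined for a Gen-AI pipeline. Field names are hypothetical.

def check_quality(records, required_fields=("id", "text", "source")):
    """Flag records that are incomplete or duplicated before ingestion."""
    issues = []
    seen_ids = set()
    for i, rec in enumerate(records):
        # Completeness: every required field must be present and non-empty
        missing = [f for f in required_fields if not rec.get(f)]
        if missing:
            issues.append((i, f"missing fields: {missing}"))
        # Uniqueness: duplicate records skew training and inference
        if rec.get("id") in seen_ids:
            issues.append((i, "duplicate id"))
        seen_ids.add(rec.get("id"))
    return issues

records = [
    {"id": 1, "text": "Q3 earnings call transcript", "source": "audio"},
    {"id": 1, "text": "Q3 earnings call transcript", "source": "audio"},  # duplicate
    {"id": 2, "text": "", "source": "email"},  # empty text field
]
for idx, problem in check_quality(records):
    print(f"record {idx}: {problem}")
```

Checks like these catch ‘garbage in’ early and cheaply; in practice they would sit alongside richer bias and relevance assessments owned jointly by data and legal teams.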

The rapid proliferation of Gen-AI solutions across various vendors and ‘hyperscalers’ presents both an opportunity and a challenge. While this diversity offers choice, it also creates complexity. For example, an institution might use a Gen-AI-powered chatbot from one vendor, a content generation tool from another and a data analytics platform from a third. Each solution may come with its own security protocols, data governance policies and monitoring dashboards, and this fragmentation can make it difficult to maintain a comprehensive view of how Gen-AI is being used in terms of load, as well as the extent to which security and ethics standards are being observed.

A unified control plane for managing data, performance and security across various applications will help to ensure consistency and transparency. With such a plane in place, AI applications remain secure, transparent and aligned with business goals, underpinned by a data strategy that prioritises quality, compliance and ethical considerations. In this way, firms can be confident that their Gen-AI initiatives will align with their own standards as well as wider regulatory requirements.
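The idea of a unified control plane can be sketched as a single gateway through which every vendor tool call passes, applying one shared policy and writing one consistent audit record. The vendor names, data classifications and policy below are illustrative assumptions, not a reference design.

```python
# A minimal sketch of a "unified control plane": every vendor tool call is
# routed through one gateway that applies a shared policy and records usage.
# Vendor names, data classes and the audit format are illustrative assumptions.

from datetime import datetime, timezone

class ControlPlane:
    def __init__(self, blocked_data_classes=("client_pii",)):
        self.blocked = set(blocked_data_classes)
        self.audit_log = []  # one consistent record format across all vendors

    def call(self, vendor, tool, data_class, payload):
        allowed = data_class not in self.blocked
        self.audit_log.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "vendor": vendor, "tool": tool,
            "data_class": data_class, "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{data_class} may not be sent to {vendor}")
        return f"{vendor}:{tool} processed {len(payload)} chars"

cp = ControlPlane()
print(cp.call("vendor_a", "chatbot", "public_docs", "summarise this policy"))
try:
    cp.call("vendor_b", "analytics", "client_pii", "account details")
except PermissionError as e:
    print("blocked:", e)
```

The design choice being illustrated is that policy and audit live in one place, so a new vendor tool inherits the firm’s controls by being registered with the gateway rather than by re-implementing them.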

Empowering the business through AI democratisation


The challenges here are complex. However, we believe that addressing the opportunities and challenges presented by Gen-AI should not be the sole preoccupation of data scientists and other AI experts. By democratising access to AI tools, organisations can empower business users to become active participants in the AI revolution, with so-called ‘no-code/low-code’ solutions playing a critical role in enabling non-technical users to leverage AI to innovate. Employees can be empowered in a number of ways:

  • No-code/low-code solutions: such tools allow business users to build AI-driven solutions through a user-friendly interface, without the need for advanced coding skills.
  • Rethinking creativity and innovation: democratising AI can further empower teams to develop creative solutions at the edge, unlocking new ideas and efficiencies for wider exploitation.

In this way, by prioritising Gen-AI solutions that offer intuitive, no-code interfaces, organisations can enable business users to drive innovation themselves, circumventing potential technical bottlenecks.

Addressing a complex technology landscape


Besides the foundational, process-driven and more collaborative questions surrounding the scaling of Gen-AI within an enterprise, there are numerous other challenges to be addressed, from how to balance innovation with deployment speed to managing the complex and fragmented vendor ecosystem many of our clients are facing. In navigating these issues, institutions should consider the following factors:

  • Walking the ‘maturity tightrope’: in the fast-changing world of Gen-AI, things don’t stand still for long, and cloud vendors in particular are rapidly evolving their AI offerings. In the face of this dynamic market environment, organisations must ensure they remain flexible and adaptive in their decision-making.
  • Security versus agility: striking a balance between risk mitigation and innovation is critical to ensuring security without stifling progress. Firms must therefore consider how best to foster valuable innovation, where such activity should be sited and how it should be funded to ensure projects can demonstrate those necessary early wins and deliver their full anticipated value.
  • A fragmented ecosystem: with multiple vendors providing AI solutions, organisations must possess flexible infrastructure and an up-to-date view of the vendor landscape in order to optimise their integrations.

Adopting an agile deployment strategy that allows for rapid experimentation while maintaining a strong governance framework for AI usage and compliance across all vendors will position organisations to achieve greater success with Gen-AI.

Addressing the critical non-functional requirements of Gen-AI applications


While functional requirements define what Gen-AI should do, non-functional requirements address how it should operate, particularly around security, performance, and regulatory compliance. In this context, there are a number of non-functional requirements for decision-makers to consider, including:

  • Regulatory compliance: it is important to build transparency into Gen-AI applications. This can be achieved in a number of ways, including by informing users when they are interacting with an AI as well as by ensuring human oversight is provided where needed.
  • Iterative feedback: continuously gathering feedback to refine applications can also ensure systems operate at their best, allowing organisations to adapt their solutions to evolving user needs and a fast-changing regulatory environment.
  • Scope control: in tending to these important issues, it is essential for organisations to avoid scope creep. By carefully managing the process for adding new features, they can avoid additions that complicate compliance or degrade the user experience.

In essence, we believe that institutions must ensure that non-functional requirements – including those of security, performance, and compliance – continue to be considered at every stage of Gen-AI development as capabilities are scaled. A leading-practice approach to Gen-AI will deliver better results and, importantly, also help firms to control their costs by avoiding expensive missteps.

Optimising Gen-AI for the enterprise – a pragmatic approach to managing cost and performance


Scaling Gen-AI can also strain resources and budgets, making it essential for organisations to take a pragmatic approach to optimisation. By balancing cost-effectiveness with performance, enterprises can unlock the full potential of Gen-AI without overextending their resources. There are several important elements to consider in terms of managing cost efficiency:

  • Resource optimisation: firms should continuously evaluate their infrastructure so they can ‘right-size’ cloud resources on-the-fly, minimising costs further through technologies like serverless computing2.
  • Private vs. public inference: organisations should also carefully weigh the relative costs and benefits of hosting their Gen-AI models on public cloud-based platforms (i.e., public inference) against running them on their own machines or private servers (i.e., private inference). Such a decision should be based on workload predictability and performance needs; both approaches have pros and cons that need to be fully understood in order to optimise cost.
  • Data quality remediation: institutions implementing Gen-AI also need to invest in data quality tools to prevent costly remediation efforts resulting from poor data, ensuring their inferences (i.e., model outputs) are as accurate and reliable as possible.
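The public-versus-private inference trade-off above is ultimately arithmetic: pay-per-token public services scale linearly with usage, while a self-hosted stack is largely a fixed monthly cost. The back-of-the-envelope sketch below illustrates the crossover; all unit costs are invented assumptions for illustration, not vendor quotes.

```python
# Back-of-the-envelope comparison of public (pay-per-token) vs private
# (largely fixed-cost, self-hosted) inference. All unit costs are
# illustrative assumptions, not vendor pricing.

def monthly_cost_public(tokens_per_month, price_per_1k_tokens):
    # Public inference: cost scales linearly with token volume
    return tokens_per_month / 1000 * price_per_1k_tokens

def monthly_cost_private(gpu_hours_per_month, gpu_hour_rate, ops_overhead):
    # Private inference: mostly fixed cost for dedicated capacity plus ops
    return gpu_hours_per_month * gpu_hour_rate + ops_overhead

for volume in (5e6, 50e6, 500e6):  # tokens per month
    public = monthly_cost_public(volume, price_per_1k_tokens=0.02)
    private = monthly_cost_private(gpu_hours_per_month=720,
                                   gpu_hour_rate=2.5, ops_overhead=3000)
    cheaper = "public" if public < private else "private"
    print(f"{volume:>12,.0f} tokens/month: public {public:>8,.0f} "
          f"vs private {private:>6,.0f} -> {cheaper}")
```

Under these assumed figures, public inference wins at low, unpredictable volumes while private inference wins once sustained volume amortises the fixed capacity, which is exactly why the decision should rest on workload predictability.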

To achieve peak cost efficiency, organisations should implement a resource optimisation strategy that balances cost and performance, ensuring their Gen-AI solutions are both scalable and sustainable. Optimising technology stacks and leveraging modular Gen-AI architectures will help them adjust performance and costs according to specific use cases. At Deloitte, the cost-effective deployment of PairD demonstrated precisely how scalability does not have to come at the expense of budget. Indeed, by leveraging modular architectures and optimising our own technology stack, PairD has been able to deliver high performance at a lower cost than many comparable market alternatives.

In conclusion, scaling Gen-AI across an enterprise requires a thoughtful and strategic approach, one that addresses both the infrastructure and data challenges of AI adoption. By leveraging cloud solutions, securing data, empowering business users and optimising for cost and performance, organisations can unlock the full potential of Gen-AI. Backed by solid infrastructure foundations, financial institutions will then be best placed to harness Gen-AI to drive innovation, agility, and sustainable growth. In our next article, we look at another key decision point for institutions – whether to train their own LLMs or use ‘out-of-the-box’ models.

___________________________________________________________________________________

References:

1 Developed by Deloitte’s AI Institute, PairD is an internal Generative AI platform designed to help the firm’s people with day-to-day tasks, including drafting content, writing code and carrying out research safely and securely. The tool is also able to create project plans, give project management best practice advice and suggest task prioritisation.

2 Serverless computing does not mean “computing without servers”. Rather, it refers to a model of application development and execution where developers are enabled to build and run their application code without the requirement to provision or manage servers or any other back-end infrastructure.