Generative AI (Gen-AI) has the potential to revolutionise industries, including financial services, but a key step towards realising that potential is institutions’ ability to scale the right infrastructure so that Gen-AI can be adopted across the enterprise wherever it is needed. This article – the third in a series of five – explores the strategic roadmap for scaling Gen-AI, addressing issues from picking the right cloud strategy and managing unstructured data to empowering business users and balancing cost with performance, all informed by insights from the deployment of Deloitte’s own proprietary Gen-AI solution, PairD1.
Cloud computing has emerged as a critical enabler of Gen-AI, providing the scalability, flexibility, and cost-effectiveness required to power enterprise adoption. Its role in accelerating Gen-AI initiatives is central, particularly when considering the need for access to cutting-edge AI tools and infrastructure.
In this context, cloud has a number of critical advantages over “on-prem” alternatives. These include:
For many institutions, the choice is clear – by leveraging the cloud as a strategic partner, they gain access to flexible, scalable infrastructure that aligns with their Gen-AI roadmaps, drawing on their providers’ elastic resources to manage growing computational demand efficiently and effectively.
However, for some organisations, full cloud adoption is not always possible due to security and privacy concerns. In such circumstances, on-premise solutions, particularly those involving large language models (LLMs), can provide a viable alternative for enterprises needing to maintain control over sensitive data.
For such institutions, a balance needs to be struck between the data needed for AI that can be held in the cloud and that requiring on-premise warehousing. Naturally, this decision needs to be carefully thought through in light of all the available options. For example, organisations can look to on-premise LLM solutions which allow them to reap the benefits of Gen-AI while simultaneously ensuring data sovereignty and addressing security concerns. Furthermore, on-premise AI solutions can help to mitigate risks related to data egress, offering greater control over where and how data is processed. The critical challenge, however, is to ensure that institutions can leverage Gen-AI securely while remaining fully compliant with data privacy regulations.
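The hybrid approach described above can be sketched in code. The following is a minimal, purely illustrative example of routing Gen-AI requests by data sensitivity, with sensitive prompts kept on-premise; the keyword markers and endpoint names are hypothetical, and a real deployment would use a proper data-classification service rather than keyword matching.

```python
# Illustrative sketch: routing Gen-AI requests by data sensitivity.
# Markers and endpoint names are hypothetical, for illustration only.

SENSITIVE_MARKERS = {"iban", "account number", "passport", "national insurance"}

def classify(prompt: str) -> str:
    """Naive keyword-based sensitivity check (a production system would
    use a dedicated data-classification service instead)."""
    text = prompt.lower()
    return "sensitive" if any(m in text for m in SENSITIVE_MARKERS) else "general"

def route(prompt: str) -> str:
    """Send sensitive prompts to an on-premise LLM endpoint; everything
    else goes to the cloud provider to benefit from elastic capacity."""
    if classify(prompt) == "sensitive":
        return "on_prem_llm"   # data never leaves the institution's perimeter
    return "cloud_llm"
```

The key design point is that the routing decision is made before any data leaves the institution, which is what mitigates the data-egress risks noted above.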
The question of how best to manage data was a key consideration in PairD’s deployment. Deloitte ensured our tool complied with data privacy laws, such as GDPR, while managing vast amounts of multi-modal data. To achieve the best outcome, we implemented comprehensive data strategies to ensure availability, compliance, and ethical considerations. This experience underlines the importance of focusing on data quality, residency, and compliance for all organisations as they scale their Gen-AI programmes, collaborating where needed across legal and security teams and ensuring the right data management strategies are in place to support their applications.
A solid infrastructure foundation is critical for the agile and responsible deployment of Gen-AI. This foundation must address various factors, from data governance and security to continuous monitoring and cost control. Key elements include:
Building a comprehensive infrastructure foundation will help to enable agile, responsible, cost-effective Gen-AI across the enterprise.
Gen-AI needs data, and in today’s digitally abundant world that data is increasingly unstructured. Voice, video and image data, alongside traditional structured data, represent vast untapped potential that organisations must harness in order to unlock the full power of Gen-AI. Key considerations for managing such multi-modal data include:
Hence, organisations embarking on their Gen-AI journeys must also invest in robust data infrastructure to manage their unstructured data holdings, ensuring these are high-quality, relevant and fully compliant with all pertinent regulations, including GDPR. For some organisations this alone will be no small feat. Against this potentially challenging backdrop, there are several data management issues for institutions to consider:
The rapid proliferation of Gen-AI solutions across various vendors and ‘hyperscalers’ presents both an opportunity and a challenge. While this diversity offers choice, it also creates complexity. For example, an institution might use a Gen-AI-powered chatbot from one vendor, a content generation tool from another and a data analytics platform from a third. Each solution might come with its own security protocols, data governance policies and monitoring dashboards, and this fragmentation can make it difficult to maintain a comprehensive view of how Gen-AI is being used in terms of load, as well as the extent to which security and ethics standards are being observed.
A unified control plane for managing data, performance and security across applications helps ensure consistency and transparency: AI applications remain secure, transparent and aligned with business goals, underpinned by a data strategy that prioritises quality, compliance and ethical considerations. In this way, firms can be confident that their Gen-AI initiatives will align with their own standards as well as wider regulatory requirements.
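To make the idea concrete, here is a minimal sketch of a unified control plane: every Gen-AI call, whichever vendor serves it, passes through a single policy and audit layer. The vendor names, policy rules and blocked terms are purely illustrative assumptions, not a real product design.

```python
import time

class ControlPlane:
    """Minimal sketch: one policy and audit layer in front of all
    Gen-AI vendors, so governance is applied consistently.
    Vendor names and policy rules here are illustrative only."""

    def __init__(self, blocked_terms):
        self.blocked_terms = {t.lower() for t in blocked_terms}
        self.audit_log = []  # one entry per call, whatever the vendor

    def invoke(self, vendor: str, task: str, prompt: str) -> str:
        # Apply the same governance check to every vendor's traffic.
        if any(t in prompt.lower() for t in self.blocked_terms):
            self.audit_log.append((time.time(), vendor, task, "BLOCKED"))
            raise PermissionError("Prompt violates data-governance policy")
        self.audit_log.append((time.time(), vendor, task, "ALLOWED"))
        # In practice this would dispatch to the vendor's own SDK or API.
        return f"{vendor}:{task}:ok"

cp = ControlPlane(blocked_terms={"customer ssn"})
cp.invoke("vendor_a", "chatbot", "Hello")
```

The single audit log is the point: rather than three vendor dashboards, compliance teams get one consistent record of what was asked, of whom, and whether policy allowed it.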
The challenges here are complex. However, we believe that addressing the opportunities and challenges presented by Gen-AI should not be the sole preoccupation of data scientists and other AI experts. By democratising access to AI tools, organisations can empower business users to become active participants in the AI revolution, with so-called ‘no-code/low-code’ solutions playing a critical role in enabling non-technical users to leverage AI to innovate. Employees can be empowered in a number of ways:
In this way, by prioritising Gen-AI solutions that offer intuitive, no-code interfaces, organisations can enable business users to drive innovation themselves, circumventing potential technical bottlenecks.
Besides the foundational, process-driven and more collaborative questions surrounding the scaling of Gen-AI within an enterprise, there are numerous other challenges to be addressed, from how to balance innovation with deployment speed, to managing the complex and fragmented vendor ecosystem that many of our clients now face. In navigating these issues, institutions should consider the following factors:
Adopting an agile deployment strategy that allows for rapid experimentation while maintaining a strong governance framework for AI usage and compliance across all vendors will position organisations to achieve greater success with Gen-AI.
While functional requirements define what Gen-AI should do, non-functional requirements address how it should operate, particularly around security, performance, and regulatory compliance. In this context, there are a number of non-functional requirements for decision-makers to consider, including:
In essence, we believe that institutions must ensure that non-functional requirements – including security, performance, and compliance – continue to be considered at every stage of Gen-AI development as capabilities are scaled. A leading-practice approach to Gen-AI will deliver better results and, importantly, help firms control their costs by avoiding expensive missteps.
Scaling Gen-AI can also strain resources and budgets, making it essential for organisations to take a pragmatic approach to optimisation. By balancing cost-effectiveness with performance, enterprises can unlock the full potential of Gen-AI without overextending their resources. There are several important elements to consider in terms of managing cost efficiency:
To achieve peak cost efficiency, organisations should implement a resource optimisation strategy that balances cost and performance, ensuring their Gen-AI solutions are both scalable and sustainable. Optimising technology stacks and leveraging modular Gen-AI architectures helps firms adjust performance and cost according to specific use cases. At Deloitte, the cost-effective deployment of PairD demonstrated that scalability does not have to come at the expense of budget: by leveraging modular architectures and optimising our own technology stack, PairD has been able to deliver high performance at a lower cost than many comparable market alternatives.
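The modular, use-case-driven approach described above can be sketched as a simple model-routing policy: routine tasks go to a small, cheap model, and only genuinely complex tasks invoke a larger, more expensive one. The model tiers, task names and per-token prices below are hypothetical assumptions for illustration, not actual vendor pricing.

```python
# Illustrative cost-aware model routing. Tier names, task categories
# and per-1k-token prices are hypothetical, for illustration only.

MODEL_TIERS = {
    "small": {"cost_per_1k_tokens": 0.0005},  # routine tasks
    "large": {"cost_per_1k_tokens": 0.0150},  # complex tasks only
}

COMPLEX_TASKS = {"code_generation", "long_document_analysis"}

def pick_model(task: str) -> str:
    """Route only complex tasks to the expensive model tier."""
    return "large" if task in COMPLEX_TASKS else "small"

def estimate_cost(task: str, tokens: int) -> float:
    """Estimated spend for a single request under this routing policy."""
    model = pick_model(task)
    return MODEL_TIERS[model]["cost_per_1k_tokens"] * tokens / 1000
```

Under these illustrative prices, routing a routine 2,000-token summarisation to the small tier costs a thirtieth of sending it to the large one, which is the kind of per-use-case adjustment a modular architecture makes possible.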
In conclusion, scaling Gen-AI across an enterprise requires a thoughtful and strategic approach, one that addresses both the infrastructure and data challenges of AI adoption. By leveraging cloud solutions, securing data, empowering business users and optimising for cost and performance, organisations can unlock the full potential of Gen-AI. Backed by solid infrastructure foundations, financial institutions will be best placed to harness Gen-AI to drive innovation, agility, and sustainable growth. In our next article, we look at another key decision point for institutions – whether to train their own LLMs or use ‘out of the box’ models.
___________________________________________________________________________________
References:
1 Developed by Deloitte’s AI Institute, PairD is an internal Generative AI platform designed to help the firm’s people with day-to-day tasks, including drafting content, writing code and carrying out research safely and securely. The tool is also able to create project plans, give project management best practice advice and suggest task prioritisation.
2 Serverless computing does not mean “computing without servers”. Rather, it refers to a model of application development and execution in which developers build and run application code without having to provision or manage servers or any other back-end infrastructure.