Deloitte Consulting LLP (Deloitte) is activating a next-generation AI supercomputing platform in collaboration with NVIDIA as part of Deloitte’s Silicon2Service AI offering, intended to power Deloitte’s AI factory. Designed for inference at scale and the accelerating demands of agentic AI, the platform expands Deloitte’s ability to help enterprises move faster from prototypes to secure, production-ready AI—while also helping them improve performance, cost efficiency, and operational resilience. Deloitte’s Silicon2Service team will help bring NVIDIA’s Nemotron 3 family of models—including Nemotron 3 Super—to life across Deloitte and its clients to accelerate agentic AI adoption. Designed for multi-agent orchestration with robust tool calling and instruction following, Nemotron 3 supports Deloitte’s tokenomics approach to maximize token efficiency and increase the value realized from modern GPU infrastructure.
As enterprises mature and increase the complexity of their AI deployments, the shift to inference at scale can drive large increases in token generation (and therefore cost). This makes it increasingly important to review enterprise AI infrastructure strategy. Accordingly, Deloitte is making a significant investment to expand its hybrid AI infrastructure capabilities, further equipping its teams to build, validate, and operate AI systems under real-world enterprise constraints. The investment can support faster time to market for Deloitte’s own solutions and a richer asset catalog to bring to clients, helping them accelerate the adoption or growth of their own AI factories.
Deloitte is taking an active leadership role in building enterprise AI, architecting infrastructure not just for today’s demand, but also for what comes next. To do so, Deloitte plans to expand its AI infrastructure with an NVIDIA GB300 NVL72 system and systems with NVIDIA RTX PRO 6000 Blackwell Server Edition, creating a foundation designed to scale as AI adoption accelerates. This effort is being executed in collaboration with HPE and Dell Technologies.
Deloitte will use this AI factory infrastructure to increase the pace of development for its AI platform and physical AI solutions while providing a path for potentially greater and more efficient expansion of its growing internal AI initiatives. Deloitte will also leverage these new systems for ongoing training and upskilling to enable it to continue to meet the dynamic needs of clients.
The platform is intended to help Deloitte:
"The rise of agentic AI and the shift to inference at scale are accelerating enterprise demand for secure, production-ready AI. By expanding its AI factory infrastructure with NVIDIA GB300 NVL72, NVIDIA RTX PRO 6000 Blackwell Server Edition, and NVIDIA Nemotron 3, Deloitte is positioned to help organizations securely build, scale, and deploy enterprise-grade AI solutions."
— John Fanelli, Vice President, Enterprise Software, NVIDIA
"Enterprise AI is moving to inference at scale and that changes everything about performance, cost, and operating discipline. We’re activating a next-generation AI factory infrastructure with NVIDIA and our ecosystem of alliances so we can build and scale AI workloads and physical AI solutions faster for our clients."
— Nicholas Merizzi, principal, Deloitte Consulting LLP
The AI factory is an infrastructure and services foundation, not a single application. It consists of accelerated compute, high-performance networking, and high-speed storage, paired with Deloitte’s software management and service layer, and it is designed to help organizations improve system efficiency, user interaction, and overall utility.
At a high level, the refreshed AI Factory as a Service includes:
Learn more: Deloitte’s AI Factory as a Service, powered by NVIDIA