Life sciences and health care organizations appear to be at an inflection point, with a unique opportunity to advance public/private information-sharing in break-the-glass scenarios that could save lives.
The pandemic is often credited with helping to accelerate change, challenge the status quo, and drive innovative research and development (R&D) solutions. Organizations the world over are innovating quickly, whether by making ventilators (largely) from car parts,1 creating contact-tracing apps,2 or investing in R&D for public health.3 But the technology infrastructure required to support innovation and data strategies can be substantial. After all, upfront investment in on-premise infrastructure is heavy, procurement periods are long, and fixed costs are considerable; together, these can become a barrier to progress. And while many global organizations have embraced cloud infrastructure to support remote work,4 they may be missing the opportunity to optimize their cloud and data strategies to enable innovation and R&D.
The cloud could be uniquely positioned to support R&D: it can provision infrastructure almost instantaneously, scale up or down as needs shift, and provide physical data center and virtual network security. What’s more, organizations can develop secure digital applications and platforms more rapidly. Finally, the cloud can store and integrate information across robust networks, giving organizations interoperable data and enabling teamwork, collaboration, and cocreation.
To understand the potential of cloud technology for next-generation R&D, in September and October of 2020, we interviewed 10 specialists in business transformation, cloud, data engineering, and R&D whose input informed the insights of this article.
Based on our research, we found three distinct approaches related to cloud-enabled data, ecosystems, and services (figure 1).
While organizations across all industries can benefit, the opportunity may be especially relevant in the life sciences and health care (LSHC) industry. The global pandemic has highlighted the need for global coordination with audiences across public and private sectors, academia, and consortia. This paper examines the different ways in which the cloud can enable R&D across the LSHC industry and beyond.
The technology to bring together real-world and clinical data, securely, at scale, and at velocity—the cloud—has been there all along. By urgent necessity, the pandemic has hypercharged this transformation for global collaboration around shared research objectives.
Deloitte’s research on radical data interoperability reveals that 60% of US LSHC organizations host more than half of their applications on the cloud already.9 However, nonstandardized data infrastructures pose a challenge in coordination10 and data interoperability within and across organizations. This is where cloud technologies can help advance transformation now and into the future.
The cloud is an enabler of data, and cloud and data modernization strategies are inextricably intertwined.11 As human genome sequencing data volumes grow to an expected 40 exabytes by 2025,12 and as scientists spend up to 30%–40% of their time searching for, aggregating, and cleansing data,13 the cloud may well be a force multiplier to get drugs to market faster and more cheaply. In fact, it has already set the world record14 for elastic analysis of genomic data.15 And Deloitte research shows that shared infrastructure and resources for master protocols can reduce research cycle time by 13%–18% and deliver overall cost savings of 12%–15%.16 The question becomes: How does the cloud enable such efficiencies and cost savings?
Organizations are beginning to understand that a centralized data warehouse isn’t the only model. There are numerous cloud data platform options, including enterprise data management, data exchanges, open architecture strategies, centralized data warehouses, and data lakes, each with its own advantages.
It’s common for valuable laboratory data to be saved on local hard drives, thumb drives, and storage area networks, introducing storage capacity, searchability, and security challenges. The cloud gives organizations various solutions to ingest, transform, analyze, and share millions of existing records with flexibility and at scale, making data a reusable asset across teams. Cloud-enabled enterprise data management platforms with a shared data lake are one common solution. Our interviews also revealed that LSHC organizations, in particular, are beginning to explore a new, emerging operating model for managing data across the organization in the form of “data-sharing neighborhoods” that generate cross-domain insights and share data with regulators.
The biotechnology company Biogen, for example, produced 50 images for a single sample per day, which were saved in local, electronic lab notes that were archived every six months. This created a data access and searchability challenge, which they addressed via the cloud.17 In another instance, Pfizer launched its Scientific Data Cloud to make research data shareable, customizable, and reusable for researchers, data scientists, software engineers, and operations. The platform was designed to enable automated scalable analysis for precision medicine to find individualized treatments for diseases like cancer, and it provides a foundation for a longer-term data marketplace.18
Cloud providers have started to offer data exchanges with clinical data, real-world evidence, and imaging data to power R&D studies and match drugs to patient populations.19 Over the next five years, cloud-enabled data exchanges and marketplaces could disintermediate data aggregators and resellers and provide new opportunities for large-scale and secure data transfers across organizations. These data marketplaces are expected to become increasingly important and could lead to a world where digital health data enables “learning health care,” with real-time clinical interventions that save lives.
The National Institutes of Health’s (NIH) All of Us Research Program will collect genomic data from 1 million people over 20 years for collaborative research.20 The UK National Health Service’s Biobank generated more than 1.5 PB of genetic, clinical, behavioral, and biometric data in its global population study.21 These projects are a concerted effort to create massive data repositories that can support future internal or external data exchanges.
Merck, a leading global biopharmaceutical company, deployed an enterprisewide, cloud-based real-world data (e.g., medical claims, EMR) analytics platform called the “Real World Data Exchange” to advance product development and commercialization. The Real World Data Exchange is an open, API-first platform that serves a broad set of stakeholders, decreasing the time to insight generation and fostering collaboration to positively impact all aspects of the product life cycle.22
It may be a challenge for commercial pharmaceutical companies to share proprietary data today, but many are sharing noncompetitive, HIPAA-compliant placebo data across studies without privacy concerns. They are doing so via an application programming interface (API)–first approach: a strategy that anticipates data-sharing across applications by design and allows for standardized, programmatic connection of applications. This approach creates a technical foundation for interoperable data-sharing as collaboration incentives and cultural norms change, and it serves as an alternative to more open models (i.e., open APIs). API-first organizations have frameworks that:
In health care, wearables are API-first technologies that allow data to be shared from devices to digital apps with privacy controls in place. Available in shareable and manageable formats, this data can be distributed across organizations for R&D purposes.
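To make the API-first idea concrete, the following is a minimal, hypothetical sketch of a data-sharing endpoint that exposes de-identified placebo-arm summaries in a standardized, programmatic way. It uses FastAPI purely as an illustrative framework; the endpoint path, field names, and sample values are assumptions, not a description of any specific company's platform.

```python
# A minimal sketch of an API-first data-sharing service, using FastAPI as an
# illustrative framework. Endpoint paths, field names, and the de-identified
# placebo-arm payload are hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Placebo Data Exchange (illustrative)")

class PlaceboArmSummary(BaseModel):
    study_id: str              # de-identified study reference
    indication: str            # therapeutic area, e.g., "oncology"
    n_subjects: int            # number of placebo-arm subjects
    mean_age: float
    adverse_event_rate: float  # events per subject-year

# In practice this would be backed by a governed data store with consent,
# de-identification, and access controls; here it is an in-memory stub.
_SUMMARIES = [
    PlaceboArmSummary(study_id="STUDY-001", indication="oncology",
                      n_subjects=250, mean_age=61.4, adverse_event_rate=0.18),
]

@app.get("/v1/placebo-summaries", response_model=list[PlaceboArmSummary])
def list_placebo_summaries(indication: str | None = None):
    """Return de-identified placebo-arm summaries, optionally filtered by indication."""
    if indication is None:
        return _SUMMARIES
    return [s for s in _SUMMARIES if s.indication == indication]
```

Because the contract is defined at the API layer first, any partner application can consume the same standardized payload without bespoke integration work.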
The cloud encrypts all data at rest and offers a variety of storage options for data with high input/output requirements. It also allows unique scale-up/scale-down capabilities. These advantages enable researchers to ingest petabytes of data at a time and run queries by temporarily scaling up compute on demand, paying for additional capacity only when it is used. At the same time, the cloud offers dense/cold archive storage for long-term archival needs.
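As an illustration of these storage controls, the sketch below uses the AWS SDK for Python (boto3) to enforce encryption at rest and to transition older raw data to cold archive storage; the bucket name and prefix are hypothetical, and other cloud providers expose equivalent capabilities.

```python
# Illustrative sketch using the AWS SDK (boto3); the bucket name and prefix
# are hypothetical, and other cloud providers offer equivalent controls.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-research-data-lake"  # hypothetical bucket

# Enforce server-side encryption at rest for every object written to the bucket.
s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}]
    },
)

# Transition raw instrument output to cold archive storage after 90 days,
# keeping recent data on faster tiers for active analysis.
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw-instrument-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/"},
                "Transitions": [{"Days": 90, "StorageClass": "DEEP_ARCHIVE"}],
            }
        ]
    },
)
```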
A vast majority (94%) of surveyed respondents believe real-world evidence in R&D will become increasingly important by 2022.23 A scalable cloud ecosystem would allow hospitals to share real-world data across their internal and external public/private networks, accelerating the entire ecosystem’s ability to target novel diseases or subpopulations who express major diseases differently. This is the cloud’s network effect in action, with just one important data set. To play that scenario forward, collaborative cloud ecosystems require the right operating model, such as a third-party arbiter, and incentives to ensure data safety, risk management, and IP protection; achieving these network benefits is still a journey in progress.
Research has shown that a network of ecosystems can help harness and accelerate distributed innovation for complex, externally driven problems.24 It can facilitate a more collaborative approach and enable more diverse perspectives.25 Deloitte’s Transforming Clinical Development research has found that some organizations are already experimenting with transformative approaches to drug development, such as the use of real-world evidence and adaptive trials. Scaling these approaches requires an ecosystem model wherein companies work collaboratively and transparently with multiple stakeholders. From a technology perspective, companies require interoperable data, knowledge management, and analytics platforms and processes, as well as scalable and secure cloud capabilities.26
From an R&D perspective, Deloitte’s 2020 Real World Evidence survey reveals that 16 of 17 participating companies are using cloud platforms for real-world evidence, and all of the surveyed mature companies have a centralized, primarily cloud-based analytics platform.27 Sharing real-world data across public/private networks in this way could also accelerate hospitals’ ability to target rare and novel diseases and to serve underrepresented populations. It could likewise open the door to cloud ML services that predict acute events such as sepsis and help clinicians understand disease states and linkages, such as congestive heart failure and diabetes.
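The sketch below is a generic illustration of the kind of acute-event risk model such services could host, trained here on synthetic vitals data with scikit-learn. In practice, a managed cloud ML service would handle training, deployment, and scaling; the features and label logic shown are assumptions for illustration only.

```python
# A generic sketch of an acute-event (e.g., sepsis) risk model trained on
# synthetic vitals data; features, coefficients, and labels are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic cohort: heart rate, temperature, white blood cell count, lactate.
n = 5_000
X = np.column_stack([
    rng.normal(90, 15, n),     # heart rate (bpm)
    rng.normal(37.2, 0.8, n),  # temperature (°C)
    rng.normal(9, 3, n),       # WBC (10^9/L)
    rng.normal(1.5, 0.7, n),   # lactate (mmol/L)
])
# Synthetic label loosely tied to elevated vitals (illustration only).
risk = 0.03 * (X[:, 0] - 90) + 0.8 * (X[:, 1] - 37.2) + 0.5 * (X[:, 3] - 1.5)
y = (risk + rng.normal(0, 1, n) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```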
The COVID-19 Healthcare Coalition, a private sector-led collaborative response to the coronavirus, developed a cloud-native platform with secure, authenticated cloud storage; a data ingestion pipeline to sort and understand 300+ curated resources; and a BigQuery-searchable metadata repository for secure, scalable collaborative research. The platform has enabled members to support frontline responders and researchers and to improve treatment regimens, vaccines, and device testing.28
However, there are challenges, too. Take biomedical R&D, for example, where researchers often lack coding expertise and, therefore, the ability to drive technology transformation themselves. For this very reason, as the volume of genomics data has grown, most biomedical research organizations have embraced PaaS/SaaS data platforms powered by the cloud. But these point solutions have created data silos that can make it difficult to produce clean, shareable data as part of a broader digital ecosystem. As organizations come to see data as an asset beyond a single analysis, an opportunity emerges for an API-enabled, cloud-native digital ecosystem.
Two of the top outcomes that life sciences companies are attempting to achieve with AI are enhancing existing products and creating new products and services. Cloud AI, ML, and the internet of things (IoT) services can provide greater innovation speed and agility across the R&D value chain. In fact, some organizations are exploring AI to better manage clinical trial data.29 Some are also using cloud AI services to coordinate and accelerate recruiting and matching patients with clinical trials sites,30 to analyze existing drugs on the market, to screen drugs for other diseases (such as COVID-19),31 and potentially to detect future disease outbreaks before they occur.32
By aiding physicians with real-time, data-driven diagnosis and treatment plans, AI-based solutions can play an important role in streamlining clinical diagnostics.33 In cancer diagnostics, for example, both false positives and false negatives pose real challenges, and part of the problem in training AI/ML models for diagnostics is assembling a large enough sample set to detect cancers such as lung and pancreatic cancer (typically found at stage 3) at stage 1. Public and private organizations are using cloud ML to improve the accuracy of cancer diagnoses in private sector research34 and to enable earlier diagnosis.35
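To illustrate why both false positives and false negatives matter, the short example below computes sensitivity and specificity from a confusion matrix over hypothetical model outputs; the labels and predictions are invented for illustration only.

```python
# Illustrative calculation of sensitivity (catching true cancers) and
# specificity (avoiding false alarms) from hypothetical model predictions.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # 1 = cancer present
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0]  # hypothetical model output

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # false negatives (missed cancers) lower this
specificity = tn / (tn + fp)  # false positives (false alarms) lower this
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```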
Bringing all three of these approaches together, the NIH has made progress with the NIH Data Commons to store, share, access, and interact with digital files generated from biomedical research.40 Its Accelerating COVID-19 Therapeutic Interventions and Vaccines partnership has brought together over a dozen biopharma companies to standardize collaborative frameworks across vaccine and therapeutic R&D, from preclinical evaluation to immune-response testing.41 Through the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative, the organization is exploring cloud and ML capabilities42 to generate, analyze, and share research data with commercial partners and 2,500 NIH-funded institutions.43 Most recently, the NIH National Center for Advancing Translational Sciences’ (NCATS’s) National COVID Cohort Collaborative (N3C) Data Enclave has created a centralized, secure, cloud-enabled data platform to analyze real-world COVID-19 patient data for factors and long-term health consequences across 57 sites.44 For its part, the Rapid Acceleration of Diagnostics initiative aims to expedite innovation around rapid COVID-19 testing,45 undetected cases,46 and public health monitoring.47
Situation
As organizations responded to the pandemic, many adapted to remote work. For Broadcom, however, a global infrastructure technology company with essential workers, engineers, and fabrication units, remote solutions were not an option. Faced with new workforce safety and risk challenges related to COVID-19 and the need to operate in a physical working environment, Broadcom wanted to go beyond standard social distancing, personal protective equipment (PPE), and hygiene, and sought to rapidly build a digital solution to alert and act on potential exposure, with employee privacy and local regulations in mind.
The cloud-native approach allowed the organization to quickly build a digital solution for data-sharing and analysis, with worker privacy and data security considerations managed by the organization’s global privacy officer. The cloud application allowed Broadcom to:
A digital, cloud-based data-sharing application launched at scale in under 10 weeks, across 10 countries, for 15,000+ workers and 5,000 contractors. It managed 350,000+ survey collections and work passes based on daily symptom analysis, along with 250,000 automated workplace check-ins.
Source: Deloitte analysis.
Many organizations across industries are looking to modernize data platforms to reduce data costs, harness big data, create more data analysis flexibility, and tap into powerful artificial intelligence tools. Cloud technology could be the key enabler for them.48 Success may boil down to scalable and secure cloud data platforms to support interoperable data strategies, an ecosystem for collaborative analysis, and services to expedite and scale R&D innovation with low latency.
To give just one example of how these aspects are coming together in another industry, the US Department of Defense’s (DoD) Defense Information Systems Agency and Joint AI Center are looking to advance the department’s intelligence strategy with a common, shared cloud-native/edge platform across 6–7 mission areas. The platform is expected to bring in multiple petabytes of data that would be impossible to move otherwise. The solution, referred to as the Joint Common Foundation (JCF), is designed to include high-level controls based on security clearance so that various DoD organizations can access, buy, and acquire AI solutions and the data behind them. It is also expected to include defensive cyber operations such as incident response, vulnerability management, continuous monitoring, and a zero-trust architecture. Ultimately, the AI infrastructure could equip warfighters with secure data and tools for faster decision-making that enable and enhance national security.
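As a simplified illustration of clearance-based access controls (not the JCF’s actual implementation), the sketch below gates data access on whether a user’s clearance level meets or exceeds a data set’s classification; the levels and function names are hypothetical.

```python
# A minimal, hypothetical sketch of clearance-gated access control in the
# spirit of the controls described above; not an actual JCF implementation.
from enum import IntEnum

class Clearance(IntEnum):
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3

def can_access(user_clearance: Clearance, data_classification: Clearance) -> bool:
    """Allow access only when the user's clearance meets or exceeds the data's level."""
    return user_clearance >= data_classification

# Example: a SECRET-cleared analyst may read CONFIDENTIAL data, but not TOP SECRET.
assert can_access(Clearance.SECRET, Clearance.CONFIDENTIAL)
assert not can_access(Clearance.CONFIDENTIAL, Clearance.TOP_SECRET)
```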
Going forward, there are five key takeaways to keep in mind (figure 2).
Cloud is more than a place, a journey, or a technology. It’s an opportunity to reimagine everything. It is the power to transform. It is a catalyst for continuous reinvention—and the pathway to help organizations confidently discover their possible and make it actual. Cloud is your pathway to possible.