Using multimodal real-world data for decision-making

Learn more about Gilead Sciences’ efforts to drive biomarker innovations and assemble their organization for integrated, data-driven research.
Authors

Ryan Fukushima
COO, Tempus

Scott Patterson, PhD
Vice President of Biomarker Sciences, Gilead Sciences, Inc.

Shahed Iqbal, PhD
Senior Director and Head of Molecular Epidemiology, Biomarker Sciences, Gilead Sciences, Inc.

This article is part of our series on strengthening clinical trial design through the use of multimodal real-world data. Read our first Q&A with Iker Huerga, Senior Vice President of RWD at Tempus here, and stay tuned for the next article in which we explore how our collaborators are using a data-first approach to clinical trials to bring treatments to patients faster and at a lower cost.

The use of multimodal real-world data (RWD) creates a wealth of opportunity for researchers – it can confirm existing knowledge, uncover new hypotheses about tumor biology, and de-risk future efforts to increase the probability of success of a clinical trial. But, in order to truly harness the power of multimodal RWD and build successful development strategies, a few elements can greatly impact a program’s success: demonstrating the value of data, educating potential collaborators on its ability to enhance their work, and incorporating the right internal expertise to communicate with stakeholders across disciplines.

Ryan Fukushima, COO of Tempus, recently sat down with Scott Patterson, PhD, Vice President of Biomarker Sciences, and Shahed Iqbal, PhD, Senior Director and Head of Molecular Epidemiology, Biomarker Sciences at Gilead Sciences, Inc., to discuss the ways in which multimodal datasets can be used to drive biomarker innovations, and how they have assembled their organization to set the standard for integrated, data-driven research. We’ve summarized their discussion below.

Responses have been edited for clarity and length.

Ryan: Thanks for joining me today, Scott and Shahed. Before we jump in, could you briefly introduce yourselves, your roles, and describe the Biomarker Sciences group within Gilead?

Scott: I’m really thrilled to be part of this discussion today. In my role at Gilead, I lead the Biomarker Sciences group, over 60 staff members responsible for biomarkers across all therapeutic areas. We are part of the development organization at Gilead, but we have very close ties with research. We support clinical development through a three-pronged biomarker strategy.

One, evaluating the target engagement. Has the drug interdicted the target in the manner we anticipate? And through pharmacodynamic (PD) biomarker assessment, we can also, alongside our clinical pharmacology colleagues, use findings to support dosing schedule estimations.

Two, interrogating the impact of disease-associated biomarkers to evaluate the mechanism of action and combination potential. We use biomarkers to reflect dysregulated pathways, and we look for early reads on disease modifying activities and reversion of some of those dysregulated pathways to normal. When we see pathways not impacted, this gives us an idea of what other drugs might be able to be used in combination.

Three, examining the heterogeneity of the clinically defined patient population as we identify baseline, or even on-treatment biomarkers, that show us which patients may or may not benefit from our therapies.

Our Biomarker Sciences group truly spans R&D and commercial as we engage with programs two to three years before they enter the clinic, and then through all stages of development into post-marketing. At the heart of our mission is the desire to better understand human pathobiology. We strive to stay informed of new technologies and approaches that could help us answer questions posed in our biomarker strategy and that included building a new domain of expertise, a molecular epidemiology function.

Shahed: In my role, I lead the molecular epidemiology function within the Biomarker Sciences group at Gilead, supporting all of the strategies that Scott just described. For our work, we use multimodal clinico-genomic data as a primary tool, including longitudinal patient-level clinical information, as well as DNA mutation data and RNA-sequencing data. Our work spans all translational activities, and we help weigh prognostic and predictive biomarkers, both novel and known, by disease indications.

Ryan: Could you describe how you designed the Biomarker Sciences group to pursue innovative biomarker strategies?

Scott: While our biomarker strategies are aligned by therapeutic area, we also encourage working across therapeutic areas in areas of common interest. For example, at times, we may want to modulate the immune system up or down, depending on the indication – whether it be in oncology, inflammation, fibrosis, etc.

Our Biomarker leads, aligned by TA, constitute more than half of the department and are accountable for the development of the Biomarker Strategy in concert with our Research and Clinical colleagues. To support biomarker discovery and the need for rapid assay prototyping and development across therapeutic areas, about a quarter of the team is a lab-based group that are experts across all assay formats. We also have a biobanking sample management group, to handle sample procurement, and the use of clinical samples in support of these efforts. We established a dedicated in vitro diagnostics (IVD) group for biomarkers that reveal themselves as diagnostic candidates, which can occur at any stage through the development process. And finally, we have our molecular epidemiology function, led by Shahed, to support all therapeutic areas by providing expertise to successfully deploy RWD and multimodal clinico-genomic datasets, in support of both our biomarker and translational efforts. RWD expertise was a capability that I had wanted to add to the department for some time.


Ryan: Could you walk us through some examples of how having the right teams in the right places has helped your group leverage multimodal RWD for biomarker discovery?

Scott: It’s important to remember that this is a nascent area, and we’re all learning. We’ve had a great time learning with the folks at Tempus.

Clinical trials provide the ability to observe outcomes in very controlled settings. But it is estimated that only about 4 percent of patients participate in clinical trials, so what about the other 96 percent? By leveraging RWD, we can include treatment information, lab data, and outcomes together with multimodal data to explore a broad patient population, including indications not yet explored in clinical development. There’s an opportunity to expand beyond what can be done within the clinical trial arena.

Even with data from clinical trials, access to additional real-world datasets is advantageous. I have worked on the translational side for 20 years, in particular seeking to identify biomarkers that can predict response or non-response to a therapeutic regimen, and even when biomarker hypotheses are generated from well-controlled trials, there are always unknown unknowns. Having access to larger, independent datasets is critical to de-risk a hypothesis prior to a prospective trial, which hasn’t been easy in the past.

There are few large genomic DNA-based datasets that also include extensive transcript data and clinical lab data. This is a key element of our strategy. How does a patient whose tumor has a specific profile respond to a therapeutic regimen? If there is a response, what’s the duration?

Our first major foray into this field was assembling a cohort in advanced non-small cell lung cancer (aNSCLC) to compare checkpoint inhibitor therapy, checkpoint inhibitor plus chemo, or chemo-alone cohorts to test patient stratification hypotheses. We examined specific target expression and inferred pathway activity from the transcript data, together with DNA profiling. Our initial aim was a pragmatic approach to explore the molecular profiles of tumors that display primary resistance to a specific therapeutic regimen.

Time to next treatment and time to treatment discontinuation are valuable markers of primary resistance. Access to Tempus’ dataset has allowed us to test published stratification hypotheses, both DNA- and RNA-based. Results for signatures such as tumor mutational burden, CD8 infiltration, and interferon gamma, as well as PD-1 and PD-L1 gene expression, confirmed the validity of the dataset and allowed us to see how our current range of targets measured up.

We chose aNSCLC due to its importance as an indication in oncology, but also, at the time, we had no internal studies from which to explore these hypotheses and our exploratory targets. Procuring a large sample set with the appropriate outcomes data was going to take too long, and would be expensive. Our initial foray into a reasonably-sized real-world, multi-omics dataset was a learning exercise for us. But it was critical to establish the trust and use of RWD by our key stakeholders.

Shahed: As Scott described, our initial intention was to compare the molecular profiles of tumors and how patients responded to specific therapies, namely immune checkpoint inhibitor (ICI) monotherapy, chemotherapy, and ICI-chemo combination therapy. We considered biopsy timing and initiation of treatment to arrive at what we thought would be an etiologically relevant window. It required a hybrid approach – one that was data-driven, but also clinically meaningful and appropriate. We brought together a multidisciplinary team of practicing oncologists, molecular biologists, biomarker scientists, bioinformatics experts, and epidemiologists. In our experience, this multidisciplinary approach was the most effective way to glean insights from this real-world clinico-genomic data. Understandably, not all the team members had equal familiarity with RWD. But along the way, those who were new to RWD started to appreciate the value it brings.
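
To make the cohort logic described above concrete, here is a minimal illustrative sketch in Python. It is not the method the Gilead or Tempus teams used: the record fields, the 90-day biopsy window, and the 90-day early-discontinuation cutoff are all hypothetical values chosen for illustration. It keeps only patients whose biopsy falls within a window before treatment start, computes time to treatment discontinuation, and reports the fraction of early discontinuations per regimen as a rough proxy for primary resistance.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    # All field names are hypothetical, not a real dataset schema.
    patient_id: str
    regimen: str                   # "ICI", "chemo", or "ICI+chemo"
    biopsy_date: date
    treatment_start: date
    treatment_end: Optional[date]  # None if still on treatment


# Hypothetical thresholds, chosen only for illustration.
BIOPSY_WINDOW_DAYS = 90
EARLY_DISCONTINUATION_DAYS = 90


def biopsy_in_window(rec: PatientRecord) -> bool:
    """True if the biopsy was taken 0 to BIOPSY_WINDOW_DAYS days before treatment start."""
    gap = (rec.treatment_start - rec.biopsy_date).days
    return 0 <= gap <= BIOPSY_WINDOW_DAYS


def days_on_treatment(rec: PatientRecord) -> Optional[int]:
    """Time to treatment discontinuation in days, or None if still on treatment."""
    if rec.treatment_end is None:
        return None
    return (rec.treatment_end - rec.treatment_start).days


def early_discontinuation_rate(records: list[PatientRecord]) -> dict[str, float]:
    """Per regimen, the fraction of in-window patients who stopped treatment early."""
    counts: dict[str, tuple[int, int]] = {}
    for rec in records:
        if not biopsy_in_window(rec):
            continue  # biopsy too far from treatment start to be informative
        total, early = counts.get(rec.regimen, (0, 0))
        ttd = days_on_treatment(rec)
        # Simplification: patients still on treatment are counted as not early.
        is_early = ttd is not None and ttd <= EARLY_DISCONTINUATION_DAYS
        counts[rec.regimen] = (total + 1, early + int(is_early))
    return {reg: early / total for reg, (total, early) in counts.items()}


# Toy records, invented for this example.
records = [
    PatientRecord("P1", "ICI", date(2021, 1, 1), date(2021, 2, 1), date(2021, 3, 15)),
    PatientRecord("P2", "ICI", date(2021, 1, 10), date(2021, 2, 1), date(2021, 12, 1)),
    PatientRecord("P3", "chemo", date(2020, 1, 1), date(2021, 2, 1), date(2021, 3, 1)),
    PatientRecord("P4", "chemo", date(2021, 3, 1), date(2021, 3, 20), None),
    PatientRecord("P5", "ICI+chemo", date(2021, 4, 1), date(2021, 4, 15), date(2021, 5, 30)),
]

rates = early_discontinuation_rate(records)
print(rates)
```

In this toy data, patient P3 is excluded because the biopsy predates treatment by more than a year. A real analysis would need time-to-event methods that handle patients still on treatment properly, and the window and cutoff would be set with clinical input rather than fixed constants.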

Ryan: What are some of the ways in which multimodal data has been underutilized? And how is this driving your thinking on how to use this data moving forward?

Shahed: Use of RWD and epidemiology aren’t new in the industry, but in the last decade, we’ve seen generational changes with RWD in terms of quantity, quality, and dimension. This shift was enabled by digitization of health records, technological advancement in the form of cloud computing capabilities, and policies that are favorable toward the use of RWD in regulatory settings. The availability of clinico-genomic data now provides us with the opportunity to use RWD in drug discovery and early development settings. There is much to be gained with this approach. Ninety percent of molecules in clinical development don’t make it to approval, but, as research has shown, molecules whose development is informed by genetics have twice as high a chance of being approved.

We can use multimodal data in earlier stages of drug development than we do today. This is a nascent space, so there are challenges to using these datasets and integrating them into existing processes, but we can use them in target validation work, translational biomarker efforts, indication expansion, predictive biomarker discovery, combination strategy, and many other opportunities.


Ryan: Looking ahead, what are you most excited about in terms of evolving the ways that multimodal data can be leveraged?

Scott: In the next few years, I hope to see our approach to incorporating RWD and multiomic datasets become the standard for drug discovery and development – particularly in early stages, where it’s less frequent now.

From a diagnostic perspective, I also believe data may help us understand the patient journey when particular lab tests or diagnostics are utilized, to inform future IVD development programs, but also to see if development of EMR-based clinical decision support tools may be beneficial. We’re seeing more and more of this.

I think use of RWD in the pathology setting will continue to expand. This is currently something we are exploring at Gilead. Having access to scanned images for all patients in a real-world dataset would be ideal, and this is clearly an area where artificial intelligence (AI) is making an impact.

Effort should also be made to increase the diversity of patients represented in these datasets, so they’re inclusive of patients from a wide cross-section of the patient community we want to serve. As a community, we need to think about how to expand these datasets in this way. It will provide an enormous benefit when developing drugs for patients.


Ryan: Can you share some of the learnings you have uncovered while building your organization and your cross-functional approach to leveraging multimodal real-world datasets?

Scott: In any new field, when you’re going to be working with external partners with specific expertise, you really need to have matched internal expertise. We knew we needed to bring in the right expertise to enable successful and mutually beneficial engagement with partners. Ensuring internal teams are educated on how to use RWD is key, and it reduces the friction they experience being involved in this work. You have to have the right expertise, but you also have to have people believe in this approach.

What’s critical is to start with efforts that you believe can demonstrate value, and carefully design the questions to ask, to ensure that, if they’re successfully addressed, they’re going to add value to the programs in support of the patients we serve.

Shahed: Organizational readiness is key. To really integrate using multimodal clinico-genomic data in biomarker discovery, or early drug development, and to do it at scale, requires a substantial resource commitment from the organization. You need champions and leadership with a vision who can engage stakeholders in a meaningful way.

All of this requires multiple iterations and continuous communication. As Scott described, it is necessary to demonstrate value through projects, and build up from there. And you will also need people who can speak languages that can resonate across different disciplines.


Ryan: What are some of the most exciting emerging data modalities you are seeing on the horizon in precision medicine?

Scott: As I mentioned earlier, digital pathology algorithms are going to be key, particularly as they are integrated into clinical decision support tools. Using multiomic datasets to inform the training and evaluation of these technologies will also be important.

I also believe the application of liquid biopsy, molecular residual disease, and residual disease monitoring will become more important. And, of course, the use of multi-cancer early detection methodologies. For all of these, large datasets that include outcomes and treatment information will be needed to prove their value to health care systems and patients by directing therapy where it can have most benefit. Expanded use of RWD will help capture the ninety-plus percent of patients not participating in clinical trials.


Ryan: What are some challenges you have observed in using clinico-genomic multimodal data?

Shahed: One challenge is a lack of standard methodology when combining these datasets. We have genomic data standards. We have good pharmacoepidemiology practice guidelines. But when we combine the two different modalities, what are the methodological implications? What are the biases or confounding factors we should be concerned about? There is an opportunity for improvement here – to conduct more research and reach a greater consensus.

Another challenge comes about when sample sizes become too small. But there is power in descriptive statistics. Stratifying the patient population and looking at different variables for descriptive purposes can still provide useful inputs when working with smaller sample sizes.
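
As a small illustration of the descriptive approach mentioned above, the following Python sketch stratifies a toy cohort and reports simple summaries per stratum. The rows, the stratum labels, and the minimum stratum size are all hypothetical. It is a sketch of the general idea, not a description of any analysis performed by the interviewees.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy rows: (biomarker_group, regimen, days_to_next_treatment).
rows = [
    ("high", "ICI", 310),
    ("high", "ICI", 280),
    ("high", "ICI", 150),
    ("low", "ICI", 60),
    ("low", "ICI", 95),
    ("low", "chemo", 120),
]

# Hypothetical cutoff below which a stratum is flagged as too small to interpret.
MIN_STRATUM_SIZE = 2


def describe_by_stratum(rows):
    """Group rows by (biomarker_group, regimen) and summarize each group."""
    groups = defaultdict(list)
    for biomarker, regimen, days in rows:
        groups[(biomarker, regimen)].append(days)

    summary = {}
    for key, values in sorted(groups.items()):
        summary[key] = {
            "n": len(values),
            "median_days": median(values),
            "min_days": min(values),
            "max_days": max(values),
            # Flag strata that are too small rather than silently dropping them.
            "too_small": len(values) < MIN_STRATUM_SIZE,
        }
    return summary


summary = describe_by_stratum(rows)
for stratum, stats in summary.items():
    print(stratum, stats)
```

Reporting the count next to each summary, and flagging small strata instead of hiding them, keeps the limits of the data visible when such tables are shared across disciplines.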

Ryan: I’m glad you brought this up – at Tempus, we ensure that we undergo a feasibility stage to inform our work with partners. We’ve talked about the growth of datasets over time – if you rewind the clocks to three years ago, we could not move ahead with some projects due to low-end sample sizes. Now, in some cases, some of this feasibility work can be redone, and today we have sufficient sample sizes.

Scott: That was why, recalling the advanced non-small cell lung cancer dataset we discussed earlier, we wanted to look at primary resistance. Because of challenges related to the differences between RWD and clinical trial outcomes measures, and as we tried to understand how we could extract the necessary information from RWD, ineffective therapy as measured by primary resistance became a clear indicator. This allows one to focus the question on the key challenge: many effective therapies are still ineffective in a large fraction of patients. It can be helpful to boil things down – focus on your initial objective, start with a simpler question, and build additional research questions from there.
