Q&A: Using RWD to bridge discovery and development in translational research

10/16/2025

Katie Igartua, PhD, VP of Translational Research, discusses how Tempus leverages real-world data, AI, and systems biology to accelerate drug discovery and development. She shares guidance on biomarker discovery, clinical trial support, and the future of precision medicine.

Authors Katie Igartua, PhD
VP of Translational Research, Tempus

Translational research bridges scientific discovery and clinical practice, a critical phase in which many promising therapies fail. To navigate this “valley of death,” researchers are increasingly turning to multimodal real-world data (RWD) and systems biology to identify mechanistic insights that could predict therapeutic response and resistance. Katie Igartua, PhD, Vice President of Translational Research at Tempus, discusses how her team applies these advanced computational methods to accelerate the development of precision medicine.

What are your team’s strategic priorities in translational research?

Our primary focus is on connecting complex biological processes with large-scale data analysis, which in turn informs our selection of computational methods. We prioritize building interpretable, biology-forward models rather than focusing solely on predictive power that might lack biological context. This systems biology approach can reveal the underlying mechanisms of therapeutic response and resistance, providing drug developers with the information needed to create more precise and effective therapies.

By analyzing longitudinal, multi-omic de-identified data from individual patients and combining it with insights from broader patient populations, we can subtype patients based on their molecular, immune, and clinical features. This allows us to identify which patient cohorts are most likely to benefit from a specific treatment and where combination therapies might work better, ultimately supporting more effective clinical trial design.

How does your team typically collaborate with life sciences partners?

Our collaborations are designed to transform a partner’s research question into a data-driven, analytical project. The structure of these engagements varies based on the research need. For example, a company evaluating multiple indications for a pan-cancer biomarker will require broad access to our database, while a team analyzing a specific clinical trial population will need a more precisely defined cohort. In other cases, developing a custom signature may call for a dedicated scientific partnership with shared data and close collaboration. In each scenario, our team works to define, analyze, and translate RWD from patient cohorts into clinically relevant insights.

Can you describe a recent example where this approach enabled a discovery in biomarker or therapeutic target research?

We recently completed a project using Tempus Loop, a platform that integrates RWD, patient-derived organoids (PDOs), and AI to rapidly identify and validate actionable targets. Our team began by screening standard-of-care antibody-drug conjugates (ADCs) against Tempus’ PDO repository. This work led to the discovery of a unique RNA-based, pan-cancer gene signature of response to enfortumab vedotin, a Nectin-4 ADC.

We then validated this signature in our RWD using a de-identified cohort of patients with bladder cancer and found that the signature was significantly associated with real-world progression-free survival. By combining wet- and dry-lab workflows, we were able to discover key driver genes, translate that finding into RWD, and validate a signature that can potentially be used to identify new indications or select patients for a clinical trial.

What role does Tempus’ RWD play in developing companion diagnostics (CDx) and supporting clinical trials?

Our RWD provides a data-driven framework for identifying, selecting, and refining biomarkers. The Nectin-4 ADC signature is an example of a discovery that could potentially be validated for use in enrolling patients into clinical trials and, if supported by further evidence, developed as a CDx. Before a biomarker can be advanced as a potential CDx, our team uses our data to confirm it is representative across different treatment journeys and standard-of-care populations. This work is fundamental to matching therapies with appropriate patient populations, and we see significant potential for novel RNA and digital pathology biomarkers to advance this effort.

What are the biggest challenges in integrating diverse data types for translational research, and how does Tempus address them?

A core challenge in working with RWD is the continuous evolution of assay designs, data pipelines, and reference genomes. To address this, Tempus has built modular pipelines that help us harmonize previously sequenced data with our current standards.

Another challenge is the lack of certain data types on a single platform. Scientists often rely on public datasets like GTEx for normal tissue comparisons, but these data typically come from post-mortem samples and were prepared using various methodologies, so they require significant batch correction. To mitigate this, we focus on integrating public and proprietary datasets within our large-scale model architecture and benchmarking tasks to support generalizability and reliability of our models across different data sources.

How does your team incorporate systems biology approaches when building predictive models?

We focus on selecting the right datasets and modeling frameworks to enrich for the biological mechanisms we are studying. For example, we may use single-cell or modeling lab datasets to identify genes with evidence across multiple data sources, which helps us build robust, replicable signatures. When using RWD, we also account for technical and biological biases, such as different tumor purity levels or RNA expression from different metastatic tissues. We model and regress out this signal to enrich for pathways and biological interpretations that are most relevant.
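As a rough illustration of what "regressing out" a technical covariate can look like, here is a minimal sketch. It is not Tempus' pipeline: it assumes a single covariate (tumor purity), a simple least-squares fit, and made-up values. The function name `regress_out` and the variables `tumor_purity` and `expression` are hypothetical.

```python
from statistics import mean


def regress_out(values, covariate):
    """Return residuals of `values` after a least-squares fit on `covariate`.

    Hypothetical helper for illustration: it fits values = a + b * covariate
    and returns what is left once the fitted line is subtracted.
    """
    if len(values) != len(covariate):
        raise ValueError("values and covariate must have the same length")
    mean_y = mean(values)
    mean_x = mean(covariate)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # A constant covariate explains nothing, so only center the values.
        return [y - mean_y for y in values]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Made-up example: one gene's expression across four samples.
tumor_purity = [0.2, 0.4, 0.6, 0.8]
expression = [1.0, 2.1, 2.9, 4.0]
residuals = regress_out(expression, tumor_purity)
print(residuals)
```

The residuals are, by construction, uncorrelated with the covariate, so downstream pathway analysis sees variation that purity alone does not explain. A real analysis would likely handle several covariates at once (for example, biopsy site as well as purity), which calls for a multivariate model rather than this one-variable fit.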

How do you design and validate machine learning models for predicting drug response?

Our validation frameworks depend on the model’s intended use. For research applications, we use a traditional framework with training, testing, and validation cohorts to help prevent overfitting and ensure reproducibility.
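The training, testing, and validation framework mentioned above can be sketched as a patient-level split. This is a minimal example under stated assumptions, not a description of Tempus' process: the function `split_cohort`, the 60/20/20 proportions, the fixed seed, and the patient identifiers are all hypothetical.

```python
import random


def split_cohort(patient_ids, train_frac=0.6, test_frac=0.2, seed=0):
    """Split patients into disjoint training, testing, and validation cohorts.

    The split is done on unique patient IDs, so a patient with several
    samples can never appear in more than one cohort.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique_ids)
    n_train = int(len(unique_ids) * train_frac)
    n_test = int(len(unique_ids) * test_frac)
    return {
        "train": unique_ids[:n_train],
        "test": unique_ids[n_train:n_train + n_test],
        "validation": unique_ids[n_train + n_test:],
    }


# Made-up identifiers for illustration only.
cohorts = split_cohort([f"patient_{i:03d}" for i in range(100)])
print({name: len(ids) for name, ids in cohorts.items()})
```

Splitting by patient rather than by sample is the design choice that matters here: with longitudinal data, samples from the same patient in both the training and validation cohorts would make performance look better than it is.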

When developing algorithmic tests for clinical practice, the validation process is significantly more rigorous. The selection of clinical cohorts and validation sets goes through a regulated process to help confirm the model is representative of the right population for the specific clinical question. This helps us understand how our algorithm performs in a true patient population as treatments and patient journeys evolve.

How do you ensure scientific rigor and reproducibility when working with RWD?

Our computational biology work follows established processes to ensure quality and reproducibility. All code undergoes peer review, and we use practices like paired coding to ensure our work is reproducible. Our team has also launched a set of R packages that sit on top of our data repository. These packages are built by experienced scientists and co-developed with data engineers to deploy best practices for using our data. This is critical for ensuring that both our internal research teams and external partners have the documentation and standardized tools needed to apply methods correctly and consistently.

Looking ahead, what scientific trends are you most excited about for the future of translational research?

I am particularly excited by the growing availability of spatial transcriptomics, single-cell data, and digital pathology. This brings the promise of true multimodality, allowing us to understand what is happening at a mechanistic level within a tissue.

Ultimately, I envision a future where foundation models can integrate clinical data alongside DNA, RNA, and spatial context from a patient’s H&E slide. This would give us a comprehensive picture of what is happening in the tumor at a very early stage. With this information, a physician could better predict which biomarkers a patient is likely to have, which IHC tests they may be positive for, and whether they should get sequencing, enabling more informed and timely treatment decisions.
