In oncology, overall survival (OS) is the gold-standard endpoint: the definitive, unbiased measure of a therapy’s clinical benefit and the anchor for real-world evidence (RWE) that supports regulatory decisions and informs clinical guidelines. The integrity of this evidence, however, depends entirely on the quality of the underlying data.
The challenge is that patient mortality data is notoriously fragmented, creating a significant risk of bias in research. Here, Emilie Scherrer, Senior Director and Head of Outcomes Research at Tempus, explains how the Tempus team built and rigorously validated a composite mortality endpoint to solve this fundamental challenge and provide a reliable foundation for the future of precision medicine research.
Q: Why is accurate mortality data so critical for oncology research, and what are the risks of getting it wrong?
Emilie Scherrer: We really need a complete picture of mortality, because incomplete capture of death events can introduce bias and lead to inaccurate real-world overall survival estimates. For instance, if you are just using EHR data and a patient is lost to follow-up, you might assume they are still alive. If they have actually passed away, the outcomes will be skewed, and you can end up with an artificially inflated survival estimate for that patient. When we are generating RWE, it is critical to have the most complete and accurate dataset available to avoid drawing flawed conclusions about a drug’s effectiveness.
Q: What are the most persistent challenges researchers face when trying to assemble a complete picture of patient mortality?
Emilie Scherrer: The primary challenge is data fragmentation and missingness. In the U.S., a patient’s journey can span multiple health systems, so a single source of truth is rare. Relying on EHR data alone captures only a proportion of the deaths confirmed by the National Death Index (NDI), which serves as the gold-standard benchmark. When patients transfer their care, they become lost to follow-up, and their death events are missed. While the NDI is comprehensive, the process to access it is difficult and involves long reporting delays, which can slow the pace of discovery. The same gaps appear when you rely on claims data alone. That’s why it was critical for us to build a composite endpoint that captures a more complete picture of mortality.
“The primary challenge is data fragmentation and missingness. In the U.S., a patient’s journey can span multiple health systems, so a single source of truth is rare.”
– Emilie Scherrer, Senior Director, Head of Outcomes Research
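To make the composite idea concrete, here is a minimal Python sketch of how death records from several de-identified sources might be pooled into a single endpoint. The source names, trust ordering, and one-record-per-patient reconciliation rule are illustrative assumptions, not a description of Tempus’s actual methodology.

```python
# Hypothetical sketch of a composite mortality endpoint.
# Source names, priorities, and the reconciliation rule are illustrative
# assumptions, not the actual Tempus methodology.
import pandas as pd

# Each source contributes de-identified records: (patient token, death date).
ehr = pd.DataFrame({"token": ["a1", "b2"], "death_date": ["2021-03-02", None]})
claims = pd.DataFrame({"token": ["b2"], "death_date": ["2021-07-15"]})
obituary = pd.DataFrame({"token": ["c3"], "death_date": ["2020-11-30"]})

sources = {"ehr": ehr, "claims": claims, "obituary": obituary}
priority = {"ehr": 0, "claims": 1, "obituary": 2}  # assumed trust ordering

# Pool all sources, keeping only rows that actually report a death.
pooled = pd.concat(
    [df.dropna(subset=["death_date"]).assign(source=name) for name, df in sources.items()],
    ignore_index=True,
)
pooled["rank"] = pooled["source"].map(priority)

# Keep one death record per patient, taken from the highest-priority source.
composite = (
    pooled.sort_values(["token", "rank"])
          .drop_duplicates(subset="token", keep="first")
          .loc[:, ["token", "death_date", "source"]]
)
print(composite)
```

In practice the reconciliation rule matters as much as the sources themselves; conflicting death dates across sources, for example, need an explicit tie-breaking policy.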
Q: You are combining data from many different sources. How do you ensure patient privacy is protected throughout this process?
Emilie Scherrer: Protecting patient privacy is paramount. We use a privacy-preserving, deterministic tokenization process to link these de-identified datasets at the patient level. Each organization, including Tempus and our data partners, uses an irreversible cryptographic process to generate a unique, non-identifying code, or “token,” from personally identifiable information. This process is deterministic, meaning the same information will always generate the same token, but it prevents the original personal information from being reconstructed. This allows us to create integrated, de-identified patient records for research without compromising patient privacy.
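As a rough illustration of the deterministic tokenization she describes, the sketch below hashes normalized identifiers with a keyed one-way function, so the same person always yields the same token while the original identifiers cannot be recovered. The field choices, normalization rules, and use of HMAC-SHA256 are assumptions made for illustration; production tokenization is typically performed by certified privacy-preserving linkage processes.

```python
# Hypothetical illustration of deterministic, irreversible tokenization.
# Field selection, normalization, and the keyed hash are assumptions, not
# the actual process used by Tempus or its data partners.
import hashlib
import hmac

SHARED_SECRET = b"site-specific-secret"  # distributed out of band, never stored with the data

def normalize(first: str, last: str, dob: str) -> bytes:
    """Canonicalize identifiers so the same person always yields the same input string."""
    return f"{first.strip().lower()}|{last.strip().lower()}|{dob}".encode("utf-8")

def tokenize(first: str, last: str, dob: str) -> str:
    """Keyed one-way hash: identical inputs give identical tokens, but PII cannot be recovered."""
    return hmac.new(SHARED_SECRET, normalize(first, last, dob), hashlib.sha256).hexdigest()

# The same identifiers produce the same token at every participating organization,
# which is what lets de-identified records be linked at the patient level.
assert tokenize("Jane", "Doe", "1960-01-31") == tokenize(" jane ", "DOE", "1960-01-31")
```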
Q: You validated this composite variable against the National Death Index (NDI). Why was using the NDI as the benchmark so important?
Emilie Scherrer: To build trust with our customers, we need to rely on a known, authoritative data source. We can’t just say our data is high-quality; we have to actually prove it. The NDI is maintained by the National Center for Health Statistics and is considered the gold standard for mortality information in the U.S. Benchmarking against this gold standard is the most rigorous way to assess the performance of our own endpoint and provide our customers with confidence in the data.
“We can’t just say that our data is high quality. We have to actually prove it. By benchmarking against the NDI gold standard, we’re able to build that trust with our customers.”
– Emilie Scherrer, Senior Director, Head of Outcomes Research
Q: What were the key performance metrics from the validation study, and what do they tell us about the reliability of the Tempus mortality endpoint?
Emilie Scherrer: The study included a large cohort of over 17,500 patients with advanced cancer and demonstrated high performance across several key metrics (a brief worked sketch of how these metrics are computed follows the list):
- Sensitivity of 82%: This means our composite variable successfully identified 82% of all deaths recorded in the NDI.
- Positive Predictive Value (PPV) of 96%: This is a critical metric. It means that when our dataset identifies a patient as deceased, it is correct 96% of the time. This high reliability minimizes the risk of misclassifying living patients as deceased.
- Specificity of 95%: This shows a very low rate of false positives, reinforcing the accuracy of our death classifications.
- Date Agreement of 96%: The date of death in our dataset matched the NDI date within a ±30-day window 96% of the time, which is crucial for accurate survival calculations.
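For readers who want to see how these figures are defined, the short sketch below computes them from a hypothetical 2×2 comparison against the NDI. The counts are invented for illustration and are not the study’s actual confusion matrix.

```python
# Illustrative only: the counts are invented to show how the metrics are
# defined, not the actual confusion matrix from the validation study.
tp = 820   # deaths captured by the composite variable and confirmed in the NDI
fn = 180   # NDI-confirmed deaths the composite variable missed
fp = 34    # patients flagged as deceased but not found deceased in the NDI
tn = 650   # patients correctly classified as not deceased

sensitivity = tp / (tp + fn)   # share of NDI deaths we capture
ppv = tp / (tp + fp)           # share of our death calls that are correct
specificity = tn / (tn + fp)   # share of living patients correctly classified

print(f"sensitivity={sensitivity:.0%}, PPV={ppv:.0%}, specificity={specificity:.0%}")
# Date agreement would be computed over the true positives: the fraction
# whose |composite death date - NDI death date| is within 30 days.
```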
Q: The validation study mentions some variation in performance across different patient subgroups. Could you elaborate on that and what it means for researchers?
Emilie Scherrer: That’s an important point. While the overall performance is strong, we did observe some variability. For example, sensitivity was slightly lower for patients with cancer types that have longer expected survival, like breast and colorectal cancer, compared to those with shorter survival, like pancreatic and non-small cell lung cancer. We hypothesize this is because patients with longer survival have more time and opportunity to become lost to follow-up from their original EHR system, making their records harder to link over time. We also saw variations by race and geography, which may be influenced by complex factors like challenges in harmonizing patient names across different cultural contexts or regional differences in end-of-life practices. Understanding these nuances helps researchers design more robust analyses.
Q: The study also mentions a “cumulative cases/dynamic controls” analysis where sensitivity was much higher, around 96-98%. Can you explain what that means in practical terms for a survival analysis?
Emilie Scherrer: This is key to understanding the real-world utility of our data. A standard sensitivity calculation treats any death we don’t capture as a “false negative.” However, in a survival analysis, patients aren’t just “dead” or “alive”; they can be “censored” if they are lost to follow-up. The cumulative cases/dynamic controls approach mimics a real survival analysis by only looking at patients with sufficient follow-up time. The high sensitivity in this analysis (96% at 6 months and 98% at 24 months) shows that the patients whose deaths we don’t capture are appropriately censored as lost to follow-up, rather than being misclassified as alive. This is the correct methodology, and it ensures that overall survival calculations are not artificially inflated.
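A minimal sketch, assuming hypothetical field names, of what that censoring looks like in practice: patients without a captured death contribute follow-up time only up to their last contact, rather than being counted as alive indefinitely. The lifelines library is used here purely for illustration.

```python
# Sketch of building (duration, event) pairs for a Kaplan-Meier analysis in
# which patients without a captured death are censored at last contact.
# Field names and dates are hypothetical.
import pandas as pd
from lifelines import KaplanMeierFitter

cohort = pd.DataFrame({
    "index_date":   pd.to_datetime(["2019-01-10", "2019-02-01", "2019-03-15"]),
    "death_date":   pd.to_datetime(["2020-06-01", None, None]),
    "last_contact": pd.to_datetime(["2020-06-01", "2021-09-30", "2019-08-20"]),
})

event = cohort["death_date"].notna()                       # True = observed death
end = cohort["death_date"].fillna(cohort["last_contact"])  # otherwise censor at last contact
duration_months = (end - cohort["index_date"]).dt.days / 30.4375

kmf = KaplanMeierFitter()
kmf.fit(duration_months, event_observed=event)
print(kmf.median_survival_time_)
```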
Q: How should life sciences companies interpret these results when considering Tempus data for their own research?
Emilie Scherrer: These results give our customers confidence to use the Tempus composite mortality variable as a reliable endpoint for their overall survival and real-world progression-free survival analyses. The high PPV assures researchers that the death events they are analyzing are real, which de-risks their research. Furthermore, the fact that we appropriately censor patients lost to follow-up means you can trust the survival curves generated from our data. Ultimately, this validation enables our customers to generate high-quality RWE that can withstand scientific and regulatory scrutiny.
Q: Looking forward, how does a validated, high-quality mortality endpoint unlock new possibilities for precision medicine?
Emilie Scherrer: Having this reliable endpoint is instrumental to this kind of research. Specifically, because we have such rich molecular information at Tempus, combining it with clinical outcomes, including this mortality data, allows us to look at very specific, molecularly defined populations. For example, a life sciences company may want to understand whether patients with a certain mutation have longer or shorter survival compared to the overall indication. By bringing in this robust mortality data, we can analyze overall survival stratified by a specific biomarker and use the data as a benchmark for a clinical trial targeting that mutated population. It provides the tools to generate reliable RWE quickly and accelerate the entire innovation cycle.
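As a hypothetical illustration of that kind of biomarker-stratified analysis, the sketch below fits separate survival curves for mutated and wild-type subgroups and runs a log-rank test. The column names, mutation flag, and toy data are assumptions for illustration, not Tempus data or methodology.

```python
# Hypothetical sketch: comparing overall survival between biomarker-defined
# subgroups using toy data. Column names and values are illustrative only.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.DataFrame({
    "duration_months": [4.1, 18.2, 9.7, 26.5, 12.0, 7.3],
    "event":           [1, 0, 1, 0, 1, 1],            # 1 = death captured, 0 = censored
    "mutated":         [True, False, True, False, False, True],
})

mut, wt = df[df["mutated"]], df[~df["mutated"]]

# Fit one survival curve per biomarker subgroup and report its median.
for label, grp in [("mutated", mut), ("wild-type", wt)]:
    kmf = KaplanMeierFitter().fit(grp["duration_months"], grp["event"], label=label)
    print(label, kmf.median_survival_time_)

# Log-rank test for a difference in survival between the two subgroups.
result = logrank_test(mut["duration_months"], wt["duration_months"],
                      event_observed_A=mut["event"], event_observed_B=wt["event"])
print(result.p_value)
```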
Q: The validation was based on data through 2022. Are there plans to update this validation as the Tempus database grows?
Emilie Scherrer: Yes, absolutely. This is not a one-time study. As our dataset continues to grow and we include more EHR integrations, we plan on conducting follow-up analyses. This will allow us to confirm the continued robustness of the endpoint and track any improvements over time. It’s part of our commitment to maintaining the highest standards of data quality for our customers.