A Comparison of Data Processing Methods To Support Patient Matching to Oncology Clinical Trials
ASCO 2026
Tunghi May Pini, Kunal Nagpal, Mark Vance, Caity Moran Rose, Michelle Huang, Michael Singer, Gianna Klonk, Ryan Godart, Tian Kang, Dan Sun, Arpita Saha, Chelsea Kendall Osterman
Background To pre-screen and match patients to trials at scale, technology tools are needed to support human workflows. These tools require numerous data inputs, including clinical concepts such as cancer diagnosis, stage, and histology. The more complete these data are, the more accurate the resulting matches and the more efficient human reviewers can be. While some clinical data are available from structured fields in clinical systems, most reside in unstructured documentation. Methods including human abstraction, natural language processing (NLP) models, and large language model (LLM) agents can improve data completeness and accuracy beyond structured electronic health record (EHR) and laboratory information management system (LIMS) data. We compared these methods for obtaining diagnosis and stage data for Tempus Link, a tool that supports patient-trial matching at scale for a national oncology trials network.
Methods We randomly sampled patients with one of four previously abstracted cancer diagnoses: lung (LC), prostate (PC), colorectal (CRC), and breast cancer (BC). For LC we also examined histology (NSCLC vs SCLC). For each method (EHR, LIMS, NLP predicted, LLM agent), diagnosis, stage, and histology values were compared to the abstracted value, and accuracy was calculated as the percent of patients with a correct value among patients with an abstracted value available. Since these data are inputs into a tool used by RN screeners who confirm matches before notifying a site, diagnosis accuracy was assessed at the level of broad cancer diagnostic categories (e.g., neoplasm of lung). In addition, we assessed completeness as the presence of a usable value across all patients in a cohort.
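The two metrics defined above can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the authors' analysis code: the function names are hypothetical, a missing value is represented as `None`, and a patient with an abstracted value but no method value is assumed to count as incorrect.

```python
from typing import Optional, Sequence


def accuracy(predicted: Sequence[Optional[str]],
             abstracted: Sequence[Optional[str]]) -> Optional[float]:
    """Percent of patients with a correct value, among patients
    who have an abstracted (reference) value available."""
    # Keep only patients with a reference value; this is the denominator.
    pairs = [(p, a) for p, a in zip(predicted, abstracted) if a is not None]
    if not pairs:
        return None  # no reference values, accuracy is undefined
    # Assumption: a missing prediction (None) never equals a reference
    # value, so it is counted as incorrect.
    correct = sum(1 for p, a in pairs if p == a)
    return 100.0 * correct / len(pairs)


def completeness(predicted: Sequence[Optional[str]]) -> Optional[float]:
    """Percent of all patients in the cohort with a usable value."""
    if not predicted:
        return None  # empty cohort, completeness is undefined
    usable = sum(1 for p in predicted if p is not None)
    return 100.0 * usable / len(predicted)
```

Under these definitions the two denominators differ: accuracy is computed over patients with an abstracted value, while completeness is computed over the whole cohort. For example, 30 correct stage values among 41 patients with an abstracted stage gives 30/41, or 73.2%.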
Results Completeness for EHR, NLP, and LLM was higher for diagnosis than for stage (91.7 – 100% vs 20 – 95.8%). LIMS completeness was lower for all variables across cohorts, ranging from 0 – 87.8%.
Conclusions Variability exists across data sources and processing methods in generating data inputs for trial matching. All methods have high completeness and accuracy for diagnosis, while there are significant gains when applying NLP and LLMs for stage and histology. To balance accuracy, matching efficiency, and cost, enhanced data processing methods are required for certain variables but may not be needed for others.
Table 1: Accuracy by cohort, variable, and data processing method
| Cohort | Variable | EHR | LIMS | NLP | LLM |
| LC (n=48) | # diagnosis correct | 48 (100%) | 33 (68.8%) | 48 (100%) | 45 (93.8%) |
| | # NSCLC correct | 1 (2.1%) | 27 (56.3%) | 48 (100%) | 39 (81.3%) |
| | # stage correct (n=41) | 16 (39%) | 0 (0%) | 27 (65.9%) | 30 (73.2%) |
| PC (n=48) | # diagnosis correct | 48 (100%) | 43 (89.6%) | 48 (100%) | 44 (91.7%) |
| | # stage correct (n=45) | 22 (48.9%) | 0 (0%) | 15 (33.3%) | 38 (84.4%) |
| CRC (n=49) | # diagnosis correct | 47 (95.9%) | 42 (85.7%) | 47 (95.9%) | 46 (93.9%) |
| | # stage correct (n=46) | 21 (45.7%) | 0 (0%) | 36 (78.3%) | 42 (91.3%) |
| BC (n=50) | # diagnosis correct | 50 (100%) | 9 (18%) | 47 (94%) | 50 (100%) |
| | # stage correct (n=36) | 5 (13.9%) | 0 (0%) | 23 (63.9%) | 35 (97.2%) |