Assessing the Reliability, Accuracy, and Utility of Clinical Abstraction Methods From Unstructured Electronic Health Records (EHRs).

ASCO 2025

May 22, 2025
Oncology

Katie Mo, Xifeng Wang, Kaitlynn Cunnea, Bridget Bax, Maria A Berezina, Chelsea Kendall Osterman, Riccardo Miotto, Chithra Sangli

Background: Real-world oncology data integrates structured and unstructured EHR-based information relating to clinical characteristics, treatment patterns, and outcomes for patients. At Tempus, unstructured records are abstracted into structured fields through a uniform, rules-based, human curation process. We aim to measure the performance of our abstraction process by evaluating inter-abstractor reliability, accuracy compared to an independent oncologist, and the utility of Tempus (de-identified) abstracted data for estimating real-world outcomes.

 

Methods: Two randomly selected abstractors (blinded to study participation) independently abstracted the unstructured records of 222 patients with advanced or metastatic non-small cell lung cancer (a/mNSCLC). Clinical variables were assessed in the demographic, diagnosis, third-party lab biomarker results, first-line (1L) treatment, and outcome data domains. A subset of 40 patients was reviewed by an oncologist. The primary measure of inter-abstractor reliability was Gwet’s agreement coefficient (AC). Agreement on categorical variables was assessed both excluding missing data and including it as a category. Date agreement was calculated for presence/absence, as well as for exact matches and matches within ±15 days and ±30 days. The Kaplan-Meier estimate of real-world progression-free survival (rwPFS) on combination 1L platinum-based chemotherapy (PBC) and immunotherapy (IO) was derived in 2,980 a/mNSCLC patients diagnosed between 2018 and 2023.
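For readers unfamiliar with Gwet’s AC, the following is a minimal sketch of the two-rater calculation for a single categorical variable. It assumes the unweighted AC1 variant with nominal categories, which the abstract does not specify; the function name `gwet_ac1`, the `include_missing` flag, and the use of `None` to represent missing data are illustrative assumptions, not the study's actual implementation.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b, include_missing=True):
    """Gwet's AC1 for two raters on one nominal variable (illustrative sketch).

    None marks a missing value. With include_missing=True, missing is
    treated as its own category; otherwise any pair with a missing
    value is dropped before agreement is computed.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same patients")
    pairs = list(zip(rater_a, rater_b))
    if not include_missing:
        pairs = [(a, b) for a, b in pairs if a is not None and b is not None]
    if not pairs:
        raise ValueError("No rated pairs to compare")

    n = len(pairs)
    counts_a = Counter(a for a, _ in pairs)
    counts_b = Counter(b for _, b in pairs)
    categories = set(counts_a) | set(counts_b)
    q = len(categories)
    if q < 2:
        return 1.0  # only one category was ever used, so agreement is perfect

    # Observed agreement: share of patients given the same category.
    p_observed = sum(a == b for a, b in pairs) / n

    # Chance agreement under AC1: sum of pi_k * (1 - pi_k) over categories,
    # divided by (q - 1), where pi_k is the mean of both raters' marginals.
    p_chance = sum(
        pi * (1 - pi)
        for pi in ((counts_a[c] + counts_b[c]) / (2 * n) for c in categories)
    ) / (q - 1)

    return (p_observed - p_chance) / (1 - p_chance)
```

Calling the function twice on the same variable, once with `include_missing=True` and once with `include_missing=False`, mirrors the two ways categorical agreement is described above.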

 

Results: Gwet’s AC was high (≥0.82) between abstractors across demographic, diagnosis, biomarker, and treatment domains. Among the 181 patients for whom abstractors agreed on 1L class and initiation date within ±30 days, the agreement in progression presence and date was 0.83-0.93. Gwet’s AC was 0.96-1 for death presence and date. Percent agreement between at least one abstractor and the oncologist was high, ranging from 85%-100% among categorical variables and 80%-100% within ±30 days for date variables. Median rwPFS on 1L PBC and IO was 7.9 months, in line with KEYNOTE-189 and -407. All patients with progression and a non-missing date of progression had a clinically relevant downstream event: 97% with a 1L treatment end date, 100% with a 2L treatment start date, and 69% with a deceased date.
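The date-agreement measures reported above (presence/absence, exact, within ±15 days, within ±30 days) can be illustrated with a short sketch. It assumes that the windowed percentages are computed only among patients for whom both reviewers recorded a date, a denominator the abstract does not state; the function name `date_agreement` and the use of `None` for an absent date are hypothetical.

```python
from datetime import date


def date_agreement(dates_a, dates_b, windows=(0, 15, 30)):
    """Percent agreement on date presence and on dates within +/- N days.

    Returns (presence_agreement, {window_days: agreement}). A window of 0
    means an exact match. Windowed agreement is None when no patient has
    a date from both reviewers.
    """
    if len(dates_a) != len(dates_b) or not dates_a:
        raise ValueError("Both reviewers must cover the same, non-empty cohort")
    pairs = list(zip(dates_a, dates_b))

    # Presence/absence agreement: both recorded a date, or both recorded none.
    presence = sum((a is None) == (b is None) for a, b in pairs) / len(pairs)

    # Windowed agreement, restricted to pairs where both dates are present.
    both = [(a, b) for a, b in pairs if a is not None and b is not None]
    within = {
        w: (sum(abs((a - b).days) <= w for a, b in both) / len(both)) if both else None
        for w in windows
    }
    return presence, within


# Toy example with made-up dates (not study data).
reviewer_1 = [date(2021, 1, 1), date(2021, 3, 1), None, date(2021, 6, 1)]
reviewer_2 = [date(2021, 1, 1), date(2021, 3, 11), None, None]
presence, within = date_agreement(reviewer_1, reviewer_2)
```

In this toy example the reviewers agree on date presence for three of four patients, and the two patients with dates from both reviewers match exactly in one case and within 15 days in the other.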

 

Conclusions: These results demonstrate that the rules-based, human abstraction process as designed is reliable and accurate across the data domains commonly used in insight generation. The resulting data product has utility for estimating real-world outcomes.

 

| Domain | AC (Min-Max) |
| --- | --- |
| Demographic (birth date, sex, race, ethnicity, smoking status) | 0.96-1 |
| Diagnosis (stage, histology, year of diagnosis) | 0.87-0.99 |
| Biomarker (EGFR, ALK, ROS1, PD-L1, BRAF, RET, NTRK) | 0.87-1 |
| Treatment (agents, class, dates) | 0.82-0.97 |