Reducing Manual Chart Review in Oncology: Validation of an AI-Based Monitoring Tool for Lung Cancer Staging and Biomarker Tracking

ASCO 2026

May 21, 2026
Oncology
Abstract

Katie Mo, Tian Kang, Journey Penney, Jessica Dow, Chithra Sangli, Noah Zimmerman, Riccardo Miotto

Background
Clinical decision support in oncology depends on continuous monitoring of evolving staging and molecular data to capture shifts in a patient’s care journey. However, current systems struggle to synchronize fragmented electronic health records and extensive clinical documentation. Clinicians are left with an incomplete picture, which makes it hard to follow complex, evolving National Comprehensive Cancer Network (NCCN) guidelines without manual chart review.

Methods
We implemented an AI-based continuous monitoring system (AI-CM) that uses embeddings from large language models, fine-tuned on clinical notes, to extract clinical events and flag patients whose care deviates from NCCN guidelines. Monitoring begins at the initial oncology ICD code, where the system tracks follow-ups that determine stage, histology, and site (diagnostic features). Subsequently, it continuously monitors documents for diagnosis-specific NCCN-recommended biomarker testing (ordered or performed). We validated AI-CM in a manually annotated cohort of 7,259 patients across eight institutions (2024–2025). The evaluation focused on biomarker testing for early-stage (n = 1,155) and advanced (n = 4,448) non-small cell lung cancer (eNSCLC, aNSCLC), specifically targeting EGFR, ALK, PD-L1, and NGS.
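The two-stage monitoring flow described above (initial ICD code, then diagnostic features, then biomarker testing) can be illustrated with a minimal sketch. This is not the authors' implementation: the `PatientState` class, the `update` and `needs_full_review` functions, and the shape of the `extracted` dictionary are all hypothetical, and the sketch assumes that any one of the named biomarkers counts as testing and that the 6-week evaluation horizon doubles as the flagging window.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

# Biomarkers named in the abstract for NSCLC.
NSCLC_BIOMARKERS = {"EGFR", "ALK", "PD-L1", "NGS"}
# Assumption: reuse the 6-week evaluation horizon as the flagging window.
REVIEW_WINDOW = timedelta(weeks=6)


@dataclass
class PatientState:
    """Hypothetical per-patient monitoring state."""
    first_icd_date: date
    diagnosis_date: Optional[date] = None
    biomarkers_seen: set = field(default_factory=set)


def update(state: PatientState, doc_date: date, extracted: dict) -> None:
    """Fold one document's extracted events into the patient state.

    `extracted` stands in for the output of an upstream extraction model,
    e.g. {"diagnosis_confirmed": True, "biomarkers": ["EGFR"]}.
    """
    if state.diagnosis_date is None and extracted.get("diagnosis_confirmed"):
        state.diagnosis_date = doc_date
    # Biomarker tracking only starts once diagnostic features are established.
    if state.diagnosis_date is not None:
        found = set(extracted.get("biomarkers", []))
        state.biomarkers_seen |= NSCLC_BIOMARKERS & found


def needs_full_review(state: PatientState, today: date) -> bool:
    """Flag patients with no diagnosis or no testing inside the window."""
    if state.diagnosis_date is None:
        return today - state.first_icd_date > REVIEW_WINDOW
    if not state.biomarkers_seen:
        return today - state.diagnosis_date > REVIEW_WINDOW
    return False
```

In this sketch a patient is flagged for full chart review only when the window elapses without the expected event, which mirrors the idea of reserving manual review for the minority of patients the system cannot resolve.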

Results
AI-CM identified eNSCLC and aNSCLC diagnostic features with precision of 0.77 and 0.85 and recall of 0.91 and 0.94, respectively, and biomarker testing with precision of 0.75 and recall of 0.81. Incorporating monitoring for diagnostic features reduced false-positive diagnoses by 25% relative to relying solely on ICD codes. The median time to diagnosis was 41 days for eNSCLC and 16 days for aNSCLC, with a median time from diagnosis to testing of 9 days. On average, AI-CM identified the diagnosis for 82% of patients within 6 weeks of the initial ICD code, and biomarker testing for 88% of patients within 6 weeks of diagnosis (Table 1). Full review can therefore be limited to the much smaller proportions of patients with a missed diagnosis (18%) or who are non-adherent to NCCN testing guidelines (12%). By reviewing only the AI-recommended document to confirm diagnosis for the 89% of patients correctly identified as aNSCLC, and all documents for the remaining patients, the average number of documents requiring review within the first 6 weeks to determine these diagnoses was reduced from 5.5 to 1.3 per patient (a 76% reduction).

Conclusions
AI-based continuous monitoring improves timely identification of patients deviating from guidelines and reduces manual review by prioritizing documents for human verification.

Table 1. Performance of AI-CM at 6 weeks from the previous clinical event.

Target         | Patients Identified (%) | Mean Documents to Review per Patient | AI-CM Mean Documents to Review per Patient | AI-CM Documents to Review Reduction (%)
eNSCLC         | 76                      | 7.3                                  | 2.4                                        | 67
aNSCLC         | 89                      | 5.5                                  | 1.3                                        | 76
eNSCLC Testing | 89                      | 5.2                                  | 1.5                                        | 71
aNSCLC Testing | 88                      | 7.5                                  | 1.8                                        | 76
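
The reduction column in Table 1 appears to follow from the two mean document counts as (baseline - AI-CM) / baseline. The short check below recomputes it from the reported means; the function name `reduction_pct` and the rounding to whole percentages are assumptions made for illustration, not details stated in the abstract.

```python
def reduction_pct(baseline_docs: float, ai_docs: float) -> int:
    """Percent reduction in documents to review, rounded to a whole percent."""
    return round(100 * (baseline_docs - ai_docs) / baseline_docs)


# (baseline mean docs, AI-CM mean docs, reported reduction %) from Table 1.
TABLE_1 = {
    "eNSCLC": (7.3, 2.4, 67),
    "aNSCLC": (5.5, 1.3, 76),
    "eNSCLC Testing": (5.2, 1.5, 71),
    "aNSCLC Testing": (7.5, 1.8, 76),
}

for target, (baseline, ai, reported) in TABLE_1.items():
    computed = reduction_pct(baseline, ai)
    status = "matches" if computed == reported else "differs from"
    print(f"{target}: computed {computed}% {status} reported {reported}%")
```

Under this rounding assumption, all four recomputed values agree with the reported percentages.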