Prediction of MET Overexpression in Lung Adenocarcinoma from Hematoxylin and Eosin Images

The American Journal of Pathology Manuscript
Authors: Kshitij Ingale, Sun Hae Hong, Josh S.K. Bell, Abbas Rizvi, Amy Welch, Lingdao Sha, Irvin Ho, Kunal Nagpal, Aïcha Bentaieb, Rohan P. Joshi, and Martin C. Stumpe

Mesenchymal-epithelial transition (MET) protein overexpression is a targetable event in non-small cell lung cancer and is the subject of active drug development. Challenges in identifying patients for these therapies include lack of access to validated testing, such as standardized immunohistochemistry assessment, and consumption of valuable tissue for a single gene/protein assay. Prescreening algorithms that predict MET overexpression from routinely available digitized hematoxylin and eosin (H&E)-stained slides could direct testing toward the patients most likely to benefit. Recent literature reports a positive correlation between MET protein overexpression and RNA expression. In this work, a large database of matched H&E slides and RNA expression data was leveraged to train a weakly supervised model to predict MET RNA overexpression directly from H&E images. The model was evaluated on an independent holdout test set of 300 patients with MET overexpression and 289 patients with normal expression, achieving an area under the receiver operating characteristic curve of 0.70 (95th percentile interval: 0.66 to 0.74). Performance was stable across patient clinical variables and robust to synthetic noise applied to the test set. These results suggest that H&E-based predictive models could be useful to prioritize patients for confirmatory testing of MET protein or MET gene expression status.
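To make the reported metric concrete, the following is a minimal sketch of how an area under the receiver operating characteristic curve and a percentile interval can be computed for a binary holdout set. It assumes the interval comes from bootstrap resampling of patients, which the abstract does not state. The scores are synthetic placeholders and not study data, and the function names `roc_auc` and `bootstrap_auc_interval` are hypothetical. Only the class sizes (300 and 289) are taken from the text.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic; tied scores get average ranks."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(n, dtype=float)
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < n and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        # 1-based average rank for the tied run.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n_pos = int(labels.sum())
    n_neg = n - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def bootstrap_auc_interval(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Point AUC plus a percentile interval from patient-level resampling.

    Assumption: a plain nonparametric bootstrap; the source may use another scheme.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, size=n)
        resampled = labels[idx]
        # Skip degenerate resamples that contain only one class.
        if resampled.all() or not resampled.any():
            continue
        aucs.append(roc_auc(resampled, scores[idx]))
    low, high = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc(labels, scores), float(low), float(high)


# Synthetic placeholder scores (NOT study data); class sizes follow the abstract.
demo_rng = np.random.default_rng(42)
demo_labels = np.concatenate([np.ones(300, dtype=bool), np.zeros(289, dtype=bool)])
demo_scores = np.concatenate([
    demo_rng.normal(0.6, 0.2, size=300),   # hypothetical overexpressed-class scores
    demo_rng.normal(0.45, 0.2, size=289),  # hypothetical normal-class scores
])
demo_auc, demo_low, demo_high = bootstrap_auc_interval(demo_labels, demo_scores)
print(f"AUC = {demo_auc:.2f} (percentile interval: {demo_low:.2f} to {demo_high:.2f})")
```

Resampling at the patient level keeps each patient's label and score together, so the interval reflects sampling variability in the test cohort rather than in the model.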