Development and validation of a deep learning-based microsatellite instability predictor from prostate cancer whole-slide images

npj Precision Oncology MANUSCRIPT
Authors Qiyuan Hu, Abbas A. Rizvi, Geoffery Schau, Kshitij Ingale, Yoni Muller, Rachel Baits, Sebastian Pretzer, Aïcha BenTaieb, Abigail Gordhamer, Roberto Nussenzveig, Adam Cole, Matthew O. Leavitt, Ryan D. Jones, Rohan P. Joshi, Nike Beaubier, Martin C. Stumpe & Kunal Nagpal


Microsatellite instability-high (MSI-H) is a tumor-agnostic biomarker for immune checkpoint inhibitor therapy. However, MSI status is not routinely tested in prostate cancer, in part due to low prevalence and assay cost. Predicting MSI status from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) could therefore identify the prostate cancer patients most likely to benefit from confirmatory testing, both to evaluate their eligibility for immunotherapy and to assess the need for Lynch syndrome testing. Prostate biopsies and surgical resections from prostate cancer patients referred to our institution were analyzed. MSI status was determined by next-generation sequencing. Patients sequenced before a cutoff date formed an algorithm development set (n = 4015, MSI-H 1.8%) and a paired validation set (n = 173, MSI-H 19.7%) that consisted of two serial sections from each sample, one stained and scanned internally and the other at an external site. Patients sequenced after the cutoff date formed a temporally independent validation set (n = 1350, MSI-H 2.3%). Attention-based multiple instance learning models were trained to predict MSI-H from H&E WSIs. The predictor achieved area under the receiver operating characteristic curve values of 0.78 (95% CI [0.69–0.86]), 0.72 (95% CI [0.63–0.81]), and 0.72 (95% CI [0.62–0.82]) on the internally prepared, externally prepared, and temporal validation sets, respectively, indicating that its predictive performance generalizes both to external staining and scanning processes and to temporally independent samples. Although MSI-H status is significantly correlated with Gleason score, the model remained predictive within each Gleason score subgroup.
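The abstract names the model family (attention-based multiple instance learning) without giving architectural details. For readers unfamiliar with the idea, the following is a minimal sketch, assuming the commonly used attention-pooling formulation in which each tile embedding h_k receives a score w^T tanh(V h_k), the scores are normalized with a softmax over tiles, and the attention-weighted average is passed to a logistic head. The function name, parameter names, dimensions, and random untrained weights are all hypothetical and are not taken from the manuscript.

```python
import math
import random


def attention_mil_predict(tiles, V, w, classifier_w, classifier_b):
    """Score one slide (a bag of tile embeddings) with attention pooling.

    tiles: list of tile embeddings, each a list of D floats
    V: H-by-D projection matrix, given as a list of H rows
    w: attention vector of length H
    classifier_w, classifier_b: linear head applied to the pooled embedding
    Returns (slide-level probability, per-tile attention weights).
    """
    # Unnormalized attention score per tile: w^T tanh(V h_k)
    scores = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(a * u for a, u in zip(w, hidden)))

    # Softmax over tiles, shifted by the max score for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Attention-weighted average of the tile embeddings
    dim = len(tiles[0])
    pooled = [sum(a * h[j] for a, h in zip(attention, tiles)) for j in range(dim)]

    # Logistic head producing a slide-level probability
    logit = sum(c * z for c, z in zip(classifier_w, pooled)) + classifier_b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, attention


# Toy usage with random, untrained parameters (illustrative only)
rng = random.Random(0)
D, H, n_tiles = 4, 3, 5  # assumed sizes, far smaller than real embeddings
tiles = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(n_tiles)]
V = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
w = [rng.gauss(0, 1) for _ in range(H)]
clf_w = [rng.gauss(0, 1) for _ in range(D)]
prob, attention = attention_mil_predict(tiles, V, w, clf_w, 0.0)
```

In a formulation like this, the attention weights sum to one across tiles, so they can be read as the relative contribution of each tile to the slide-level score. In practice the parameters would be learned from slide-level MSI labels rather than drawn at random.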