Authors
Kshitij Ingale, Sun Hae Hong, Qiyuan Hu, Renyu Zhang, Bolesław L. Osinski, Mina Khoshdeli, Josh Och, Kunal Nagpal, Martin C. Stumpe, and Rohan P. Joshi
Molecular testing of tumor samples for targetable biomarkers is restricted by a lack of standardization, long turnaround times, cost, and limited tissue availability across cancer types. Additionally, targetable alterations of low prevalence may not be tested in routine workflows. Algorithms that predict DNA alterations from routinely generated hematoxylin and eosin–stained images could prioritize samples for confirmatory molecular testing. Approaches that train a separate algorithm for each alteration are limited by cost and by the need for a large number of mutation-positive samples. In this work, models were trained to predict multiple DNA alterations simultaneously from hematoxylin and eosin images using a multitask approach. Compared with biomarker-specific models, this approach performed better on average, with pronounced gains for rare mutations. The models generalized reasonably well to independent temporal holdout, externally stained, and multisite The Cancer Genome Atlas test sets. Additionally, whole slide image embeddings derived from the multitask models demonstrated strong performance in downstream tasks that were not part of training. Overall, this is a promising approach for developing clinically useful algorithms that provide multiple actionable predictions from a single slide.
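The abstract does not describe the model architecture or the loss function, so the following is only a minimal sketch of what a multitask setup of this kind could look like, not the authors' implementation. It assumes a single shared slide embedding, one logistic head per biomarker, and a binary cross-entropy loss that skips biomarkers with no label for a given sample. The names `GENE_A` and `GENE_B`, the helper functions, and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_all(embedding, heads):
    """Return one probability per biomarker from a single shared embedding.

    `heads` maps a biomarker name to a (weights, bias) pair. Both the
    names and the linear-head design are assumptions for illustration.
    """
    probs = {}
    for name, (weights, bias) in heads.items():
        logit = sum(w * x for w, x in zip(weights, embedding)) + bias
        probs[name] = sigmoid(logit)
    return probs


def masked_bce(probs, labels):
    """Mean binary cross-entropy over biomarkers that have a known label.

    labels[name] is 1 (altered), 0 (wild type), or None (not tested).
    Untested biomarkers contribute nothing to the loss.
    """
    losses = []
    for name, p in probs.items():
        y = labels.get(name)
        if y is None:
            continue
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical slide embedding and two hypothetical biomarker heads.
embedding = [0.5, -1.0, 2.0]
heads = {
    "GENE_A": ([0.2, 0.1, -0.3], 0.0),
    "GENE_B": ([-0.4, 0.0, 0.1], 0.5),
}
probs = predict_all(embedding, heads)
loss = masked_bce(probs, {"GENE_A": 1, "GENE_B": None})
```

In a sketch like this, every labeled biomarker contributes a training signal to the shared embedding, which is one plausible reason a multitask model could help with rare alterations. Skipping untested biomarkers lets samples with incomplete molecular testing still be used.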