Machine Learning-Based Cluster Analysis of Patients with Significant Tricuspid Regurgitation Reveals Distinct Population With Different Phenotypes and Clinical Outcomes

American Society of Echocardiography Presentation
Authors Bliss Uribe, Shizhen Liu, Arati Gurung, Pradeep Yadav, Vivek Rajagopal, Amy Simone, Miguel Sotelo, Loren Wagner, Chris Rogers, Vibhav S Rangarajan, Peter Flueckiger, and Mani Vannan


Survival, morbidity, and response to therapy are heterogeneous in patients with significant tricuspid regurgitation (TR). The complex interplay of multiple clinical and imaging factors is a key determinant of risk assessment and decision-making about therapy. Machine learning approaches can unravel these complex relationships and provide a better understanding of the disease.


  • 1,216 patients with more than moderate TR (506 male, mean 71 years) between 2014 and 2017
  • K-means clustering (K=3) was performed. Variable selection reduced the sample size to 854 patients: of 73 clinical/laboratory/echocardiographic variables, 12 continuous variables were selected based on:
    • Data completeness (>=90%)
    • Significant association with adverse clinical outcome (death or hospitalization) in a univariate analysis
    • Variables with a high degree of relationship (information similarity), based on Cramér’s V coefficient (V >= 0.4) and linear regression (R² > 0.7)
  • Patient comorbidities and clinical outcome rates were compared between the clusters using the chi-squared test and a Cox proportional hazards model



  • 550 (45%) of TR patients died (median 172 days, min 0 days, max 2,138 days), and overall one-year mortality rate was 28%. 789 (65%) of the patients had an adverse clinical outcome, and 49% had the event within one year.
  • There was a high burden of comorbidities: 35% had NYHA III or IV, 59% had atrial fibrillation, 68% had dyspnea, 35% had a pacemaker/ICD. Pacemaker/ICD lead (OR 2.1, p<0.001), dyspnea (OR 1.7, p<0.001), and NYHA III or IV (OR 2.2, p<0.001) were independently associated with adverse clinical outcomes.
  • Also, lower TAPSE (OR 0.53, p<0.001) and LVEF (OR 0.98, p<0.001), and higher TR EROA (OR 1.5, p=0.045) and RVSP (OR 1, p=0.01), were independently associated with adverse clinical outcomes.
  • K-means clustering grouped patients into three groups based on continuous variables.
  • Patients in cluster 3 (n=119) had a higher rate of adverse clinical outcomes (71%) than patients in cluster 2 (n=367; 68%) and cluster 1 (n=368; 60%) (Figure B). The estimated odds of an adverse clinical outcome in cluster 3 were 66% higher than in cluster 1 (OR 1.7, p=0.016) and marginally (18%) higher than in cluster 2 (p=0.27). Cluster 2 patients were also significantly more likely to have an adverse clinical outcome than cluster 1 patients (OR 1.4, p=0.017).
  • The key features of the three phenogroups are shown in Figure C. The gray boxes show the rate of prevalence of these features in each of the clusters.
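As a rough check on how the between-cluster odds ratios relate to the reported event rates, the unadjusted arithmetic can be sketched as follows. This assumes only the rounded percentages given above (60%, 68%, 71%); the helper names `odds` and `odds_ratio` are illustrative, and the authors' estimates may come from a fitted model on unrounded counts, so exact agreement is not expected.

```python
def odds(rate):
    """Convert an event rate (a proportion between 0 and 1) into odds."""
    return rate / (1.0 - rate)


def odds_ratio(rate_a, rate_b):
    """Unadjusted odds ratio of group A relative to group B."""
    return odds(rate_a) / odds(rate_b)


# Adverse-outcome rates as reported in the abstract (rounded to whole percentages).
rates = {"cluster 1": 0.60, "cluster 2": 0.68, "cluster 3": 0.71}

or_3_vs_1 = odds_ratio(rates["cluster 3"], rates["cluster 1"])
or_3_vs_2 = odds_ratio(rates["cluster 3"], rates["cluster 2"])
or_2_vs_1 = odds_ratio(rates["cluster 2"], rates["cluster 1"])

for label, value in [("3 vs 1", or_3_vs_1), ("3 vs 2", or_3_vs_2), ("2 vs 1", or_2_vs_1)]:
    print(f"cluster {label}: unadjusted OR ~ {value:.2f}")
```

From the rounded rates this gives approximately 1.63, 1.15, and 1.42, which is close to the reported 66% higher odds (OR 1.7), 18% higher odds, and OR 1.4, respectively. The small gaps are consistent with rounding of the published percentages.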



Significant TR is associated with considerable mortality in the short and long term. Machine learning-based unsupervised clustering identifies discrete clinical phenogroups with distinct clinical outcomes. This may aid in the timing and selection of therapy, and may even help determine futility of therapy.