基础医学与临床 (Basic and Clinical Medicine), 2024, Vol. 44, Issue (6): 845-852. doi: 10.16352/j.issn.1001-6325.2024.06.0845

• Clinical Research •

Heart failure prediction model based on machine learning algorithms

HU Chuanli1, HE Xiaosong2*, ZHAO Jiang3, LI Hua1

  1. Department of Anesthesiology, the Second Affiliated Hospital of Army Medical University, Chongqing 400037, China;
  2. School of Big Data and Artificial Intelligence, Chongqing Institute of Engineering, Chongqing 400037, China;
  3. Department of Urology, the Second Affiliated Hospital of Army Medical University, Chongqing 400037, China
  • Received: 2023-11-13 Revised: 2024-01-23 Published: 2024-06-05 Online: 2024-05-24
  • Contact: *740322560@qq.com
  • Supported by:
    National Natural Science Foundation of China (81974101)

Abstract: Objective To construct heart failure risk prediction models using four machine learning algorithms, in order to support early detection of patients and therapeutic intervention. Methods The heart failure dataset published on the Kaggle community was preprocessed, and feature selection was used to identify factors related to heart failure as predictors. Four machine learning algorithms (logistic regression, decision tree, AdaBoost and XGBoost) were used to build prediction models. Accuracy, precision, recall, F1 score and area under the ROC curve (AUC) were compared to evaluate model performance. Results The study analyzed 11 features of 918 patients with heart failure and selected 10 of them for modeling. After hyperparameter tuning by grid search, the XGBoost model performed best, with accuracy, precision, recall, F1 score and AUC of 87.5%, 90.38%, 89.71%, 90.04% and 0.93, respectively. In addition, data analysis showed that exercise ST slope, exercise-induced angina, chest pain type and age were the main influencing factors for heart failure. Conclusions The XGBoost model showed the best predictive performance for heart failure, and machine learning algorithms may provide a reference for the early prevention, control and diagnosis of heart failure.

Key words: heart failure, machine learning, prediction
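
The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names (`classification_metrics`, `f1_from`, `roc_auc`) and the toy labels and scores are hypothetical, chosen only to show how accuracy, precision, recall, F1 score and AUC are defined. The last lines check that the reported F1 score is consistent with the reported precision and recall.

```python
from typing import Dict, Sequence


def f1_from(precision: float, recall: float) -> float:
    """F1 score: the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not taken from the study).
toy_true = [1, 1, 1, 0, 0, 1, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1]

toy_metrics = classification_metrics(toy_true, toy_pred)
toy_auc = roc_auc(toy_true, toy_scores)

# Consistency check on the values reported in the abstract:
# precision 90.38% and recall 89.71% should give an F1 score of about 90.04%.
reported_f1_percent = round(100 * f1_from(0.9038, 0.8971), 2)
print(toy_metrics, toy_auc, reported_f1_percent)
```

The pairwise AUC above is quadratic in the number of cases, which is acceptable for a dataset of 918 records; a rank-based formulation would scale better on larger data.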
