Background Chronic heart failure (CHF) carries a high global mortality burden, and traditional risk prediction tools are limited in accuracy and comprehensiveness. Machine learning (ML) has the potential to improve prediction performance, and a health ecology framework can systematically identify multi-dimensional risk factors. We therefore aimed to develop an ML-based mortality risk prediction model for CHF and analyse its risk factors using a health ecology framework.
Methods We enrolled 489 CHF patients from the Jackson Heart Database, with all-cause mortality during a 10-year follow-up period as the outcome measure. Guided by a five-layer health ecology framework (individual traits, behavioural characteristics, interpersonal relationships, work/living conditions, and macro policies), we selected 58 variables for analysis. The cohort was split 7:3 into training and validation sets. We applied five oversampling techniques to address data imbalance, then used random forest (RF) and k-nearest neighbour (KNN) models to identify mortality predictors. We trained seven ML algorithms, validated them via 10-fold cross-validation, and compared them using accuracy, the area under the curve (AUC), and other metrics.
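The oversampling step in the Methods can be illustrated with a minimal, standard-library sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote`, the toy feature vectors, and the choice of `k` are assumptions made for illustration, and the edited-nearest-neighbours (ENN) cleaning step of SMOTE-ENN is deliberately left out.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Toy, made-up feature vectors (hypothetical scaled values, not study data).
majority = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.2, 0.6), (0.4, 0.9), (0.3, 0.8)]
minority = [(0.8, 0.2), (0.9, 0.3), (0.7, 0.1)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, resampling of this kind is usually applied to the training portion only, so that the validation set keeps the original class distribution; the abstract does not state which portion was resampled.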
Results We identified 24 key factors: 19 for individual traits (age, body mass index (BMI), antihypertensive medication, hypoglycaemic medication, antiarrhythmic medication, systolic blood pressure, glycated haemoglobin, glomerular filtration rate, left ventricular ejection fraction, left ventricular diastolic diameter, left ventricular mass, high-density lipoproteins, low-density lipoproteins, triglycerides, total cholesterol, cardiovascular surgical history, and mitral annular early diastolic peak velocity of motion); three for individual behavioural characteristics (dark greens intake, egg intake, and night-time sleep duration); and two for living and working conditions (favourite food shop within a three-kilometre radius, and proportion of poor people in the place of residence). The model built with synthetic minority over-sampling technique combined with edited nearest neighbours (SMOTE-ENN) preprocessing and the extreme gradient boosting (XGBoost) algorithm performed best, with an accuracy of 81.58%, an AUC of 0.83, a precision of 0.87, a recall of 0.84, and an F1 score of 0.86 for predicting mortality at 10-year follow-up.
Conclusions We systematically categorised CHF mortality risk factors by integrating health ecology theory and ML. The SMOTE-ENN and XGBoost model demonstrated high accuracy, though further optimisation is needed to enhance its clinical utility in CHF risk prediction.
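As a worked check of how the reported metrics relate, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values in the Results; the helper name `f1_score` is an assumption for illustration, and the underlying confusion-matrix counts are not given in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded values as reported in the abstract.
precision, recall = 0.87, 0.84
f1 = f1_score(precision, recall)
print(round(f1, 4))

# Upper end of the values that would still round to 0.87 and 0.84.
f1_upper = f1_score(0.875, 0.845)
print(round(f1_upper, 2))
```

The rounded inputs give an F1 of about 0.85, while unrounded inputs near the top of their rounding intervals give 0.86, so the reported F1 of 0.86 is compatible with the reported precision and recall once rounding is taken into account.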
Machine learning-based risk factor analysis and prediction model construction for mortality in chronic heart failure
Qian Xu,Ruicong Yu,Xue Cai,Guanjie Chen,Yueyue Zheng,Cuirong Xu,Jing Sun
Published 2025 in Journal of Global Health
PUBLICATION RECORD
- Publication date
2025-09-12
- Fields of study
Medicine, Computer Science
- Source metadata
Semantic Scholar, PubMed