Volume - 7 | Issue - 4 | December 2025
Published
04 November, 2025
This research provides a reproducible comparative analysis of six independent machine learning classifiers for predicting in-hospital mortality among ICU patients in the PhysioNet/Challenge-2012 dataset. The term 'single' in the former title raised the expectation that the work would deal with various models. The paper evaluates the single-model classifiers SVM, LR, RF, XGB, MLPClassifier, and a Keras-based neural network, comparing their performance, calibration, and interpretability under a strictly defined set of pipelines. The main contributions include: a workflow diagram covering all processing steps; the hyperparameter search space, early-stopping hyperparameter, and random seeds; preprocessing and imputation experiments comparing mean, median, KNN, and iterative imputation; feature selection by Random-Forest RFE, using a stopping rule that disregards the frequency of stability; triangulation of predictor importance by SHAP and permutation importance; confidence intervals (CIs) and significance tests; and subgroup analyses by age, sex, and severity. The findings indicate that XGBoost achieves higher discrimination and better calibration than the other classifiers, with statistically significant improvements in ROC-AUC and Brier score. Every performance statistic is reported with a 95% CI; calibration curves, learning curves, and runtime measurements are also provided.
Keywords: ICU Mortality, XGBoost, Calibration, RFE Stability, SHAP, Reproducibility
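
The abstract reports ROC-AUC and Brier score with 95% CIs but does not say how the intervals were computed. As an illustration only, the sketch below assumes a percentile bootstrap over patients. The function names (`roc_auc`, `brier_score`, `bootstrap_ci`) and the toy labels and probabilities are hypothetical and are not taken from the paper.

```python
import random

def roc_auc(y_true, y_prob):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed method, not confirmed by the paper)."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes are required")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples with a single class, where ROC-AUC is undefined
        ps = [y_prob[i] for i in idx]
        stats.append(metric(ys, ps))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data, purely illustrative: 1 = died in hospital, 0 = survived.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.65, 0.80, 0.15, 0.90]

auc_ci = bootstrap_ci(roc_auc, y_true, y_prob)
brier_ci = bootstrap_ci(brier_score, y_true, y_prob)
print("ROC-AUC:", roc_auc(y_true, y_prob), "95% CI:", auc_ci)
print("Brier:  ", brier_score(y_true, y_prob), "95% CI:", brier_ci)
```

In practice the same resampling indices would be reused across classifiers, so that paired differences in ROC-AUC or Brier score can be tested on identical bootstrap samples. Whether the authors did this is not stated in the abstract.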