The Application of Machine Learning in Predicting Mortality Risk in Patients With Severe Femoral Neck Fractures: Prediction Model Development Study

Background: Femoral neck fracture (FNF) accounts for approximately 3.58% of all fractures in the entire body, exhibiting an increasing trend each year. According to a survey, in 1990, the total number of hip fractures in men and women worldwide was approximately 338,000 and 917,000, respectively. In China, FNFs account for 48.22% of hip fractures. Currently, many studies have been conducted on postdischarge mortality and mortality risk in patients with FNF. However, there have been no definitive studies on in-hospital mortality or its influencing factors in patients with severe FNF admitted to the intensive care unit.

Objective: In this paper, 3 machine learning methods were used to construct an in-hospital death prediction model for patients admitted to intensive care units to assist clinicians in early clinical decision-making.

Methods: A retrospective analysis was conducted using data on patients with FNF from the Medical Information Mart for Intensive Care III. After balancing the data set using the Synthetic Minority Oversampling Technique algorithm, the samples were randomly separated into a 70% training set and a 30% testing set for the development and validation, respectively, of the prediction model. Random forest, extreme gradient boosting, and backpropagation neural network prediction models were constructed with in-hospital death as the outcome. Model performance was assessed using the area under the receiver operating characteristic curve, accuracy, precision, sensitivity, and specificity. The predictive value of the models was verified in comparison to the traditional logistic model.

Results: A total of 366 patients with FNFs were selected, including 48 cases (13.1%) of in-hospital death. After the data set was balanced to a 1:1 ratio of in-hospital death to survival, 636 samples were obtained.
The 3 machine learning models exhibited high predictive accuracy, and the area under the receiver operating characteristic curve of the random forest, extreme gradient boosting, and backpropagation neural network were 0.98, 0.97, and 0.95, respectively, all with higher predictive performance than the traditional logistic regression model. Ranking the importance of the feature variables, the top 10 feature variables that were meaningful for predicting the risk of in-hospital death of patients were the Simplified Acute Physiology Score II, lactate, creatinine, gender, vitamin D, calcium, creatine kinase, creatine kinase isoenzyme, white blood cell, and age.

Conclusions: Death risk assessment models constructed using machine learning have positive significance for predicting the in-hospital mortality of patients with severe disease and provide a valid basis for reducing in-hospital mortality and improving patient prognosis.

(JMIR Bioinform Biotech 2022;3(1):e38226) doi: 10.2196/38226

Keywords: machine learning; femoral neck fracture; hospital mortality; hip; fracture; mortality; prediction; intensive care unit; ICU; decision-making; risk; assessment; prognosis

https://bioinform.jmir.org/2022/1/e38226
JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226
JMIR BIOINFORMATICS AND BIOTECHNOLOGY, Xu et al

Introduction

Femoral neck fracture (FNF) accounts for approximately 3.58% of all fractures in the entire body [1], exhibiting an increasing trend each year. According to a survey, in 1990, the total number of hip fractures in men and women worldwide was approximately 338,000 and 917,000, respectively [2]. In China, FNFs account for 48.22% of hip fractures [3].

The Medical Information Mart for Intensive Care (MIMIC) III database is a publicly available database commonly used in clinical research [4], which contains medical data on approximately 60,000 patients in the intensive care unit (ICU) at Beth Israel Deaconess Medical Center from 2001 to 2012. The ICU database is higher dimensional, denser, and more valuable in the field of medicine than the general patient electronic medical record database [5]. The large amount of data recorded from these treatments and examinations is conducive to the close observation of ICU patients to detect physiological changes associated with deterioration and to provide more valuable data for clinical research [6].

Currently, many studies have been conducted on postdischarge mortality and mortality risk in patients with FNF [7-9]. Sheikh et al [8] used a backward stepwise likelihood ratio Cox regression model to comprehensively analyze the causes of death in patients with FNF 30 days after surgery, and found that age, admission hemoglobin, and history of myocardial infarction were important factors associated with increased mortality. Dhingra et al [9] retrospectively analyzed the influencing factors of 1-year postoperative mortality in patients older than 60 years with FNF, and found that smoking, hypertension, diabetes, low hemoglobin, elevated white blood cell count, and surgical delay (>1 week) were significantly associated with higher 1-year postoperative mortality. Frost et al [7] used a logistic regression model to determine the risk factors of postoperative in-hospital death in patients with FNF and used a nomogram model to predict the short-term risk of death. These studies showed that age, gender, and complications were the main risk factors for in-hospital death in patients with FNF.

However, there have been no definitive studies on in-hospital mortality or its influencing factors in patients with severe FNF admitted to the ICU. Therefore, in this study, we used the electronic case information of patients with FNF recorded in the MIMIC database to examine the factors of in-hospital mortality using machine learning models, to identify indicators that are meaningful for predicting in-hospital mortality, and to support preventive measures that reduce in-hospital mortality as early as possible.

Methods

Data Source

Patient data from MIMIC-III were used for this study. MIMIC-III is a database commonly used in critical care big data studies; it contains clinical information such as demographics, vital signs, laboratory tests, treatment protocols, and diagnostic codes for 46,520 patients in ICU.

Ethical Considerations

The MIMIC-III database was approved by the Massachusetts Institute of Technology (Cambridge, MA) and Beth Israel Deaconess Medical Center (Boston, MA). The authors obtained the right to download and use the database through the Protecting Human Research Participants Exam (No. 38335409). Therefore, the ethical approval statement and the need for informed consent were waived for this manuscript.

Inclusion and Exclusion Criteria

In this study, patients admitted to the ICU for FNFs were extracted from the MIMIC-III database according to their diagnosis codes. The case information included in this study was based on the first admission, and data from patients with the first diagnosis code of FNF, including rotator fracture and intertrochanteric fracture, were selected according to the order of diagnosis codes. Patients aged ≤18 years or with ICU length of stay <24 hours were excluded, as were patients with grossly incomplete medical data records (>50% of values missing). The case screening process is shown in Figure 1.

Figure 1. Case screening flowchart. ICU: intensive care unit; MIMIC: Medical Information Mart for Intensive Care.
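The exclusion rules stated above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `IcuAdmission` record, its field names, and the example rows are hypothetical and are not taken from the MIMIC-III schema or from the authors' code.

```python
from dataclasses import dataclass


@dataclass
class IcuAdmission:
    """Hypothetical record for a first ICU admission with an FNF diagnosis."""
    subject_id: int
    age_years: float
    icu_los_hours: float
    missing_fraction: float  # share of collected variables with no recorded value


def is_eligible(adm: IcuAdmission) -> bool:
    """Apply the three exclusion rules described in the text."""
    if adm.age_years <= 18:          # aged <=18 years: excluded
        return False
    if adm.icu_los_hours < 24:       # ICU length of stay <24 hours: excluded
        return False
    if adm.missing_fraction > 0.5:   # more than 50% of values missing: excluded
        return False
    return True


# Made-up example rows, one per rule plus a boundary case.
admissions = [
    IcuAdmission(1, 82, 60, 0.10),   # eligible
    IcuAdmission(2, 17, 48, 0.00),   # too young
    IcuAdmission(3, 75, 12, 0.20),   # ICU stay too short
    IcuAdmission(4, 68, 30, 0.60),   # too many missing values
    IcuAdmission(5, 19, 24, 0.50),   # exactly on the boundaries, still eligible
]

cohort = [adm for adm in admissions if is_eligible(adm)]
print([adm.subject_id for adm in cohort])  # [1, 5]
```

Keeping each rule as a separate check makes it easy to count how many cases each criterion removes, which is the information a screening flowchart reports.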
Data Collection

Data were collected based on clinical experience, published literature, and data recorded in the MIMIC III database. Data collection for patients with FNFs was performed in the following 3 main areas: (1) demographic information: sex, age, BMI, length of ICU stay, history of previous illness, and Simplified Acute Physiology Score II (SAPS II); (2) physiological and biochemical indices within 24 hours after admission to the ICU: serum calcium, hemoglobin, hematocrit, lactate, cardiac troponin T level, creatine kinase (CK), creatine kinase isoenzyme (CKMB), vitamin D, red blood cells, white blood cells, and creatinine; and (3) outcome: whether in-hospital death occurred after admission to the ICU in patients with critical FNFs.

Data Preprocessing

The variables included in the study were screened to exclude cases with more than 50% missing values. For cases with no more than 50% missing data, the random forest (RF) algorithm was used to impute variables containing missing values sequentially in a loop [10]. The common methods for filling missing data are the mean, mode, median, and fixed value methods, and the RF algorithm is a promising alternative. The variable with missing values is treated as the prediction target, and a model is built on the observed cases to obtain predicted values for filling. The RF algorithm for filling in missing data is capable of handling mixed types of missing data and has the potential to scale up to big data environments.

Since the outcome labels extracted in this study are unbalanced (48/366, 13.1% cases in the death group and 318/366, 86.9% cases in the survival group), a model trained on this data set is prone to bias toward the majority class; therefore, the original data set needs to be balanced. In this study, the synthetic minority oversampling technique (SMOTE) function in the “imblearn” library of Python (Python Software Foundation) is used to balance the data set. For each sample x in the relatively small mortality group, the SMOTE algorithm randomly selects a sample y from the k-nearest neighbors of x and synthesizes a new mortality sample at a random point on the line segment between x and y, that is, x_new = x + λ(y − x) with λ drawn at random from [0, 1]. Starting from the 48 samples in the original mortality group, 270 new mortality samples were randomly synthesized and added to the data set to obtain a new balanced data set (mortality group: survival group = 1:1).

The linear function normalization method was used in this study to normalize the newly balanced data set. Commonly used methods are linear function normalization (min-max scaling) and 0-mean normalization (z-score standardization). Min-max scaling maps each feature to the range 0-1 through x' = (x − min) / (max − min), which removes the effect of differing measurement scales and ensures that each feature is treated equally by the classifier.

The normalized data set was randomly assigned to the training set and the test set at a ratio of 7:3. Finally, 445 cases were obtained for training the prediction model, and 191 cases were used to verify the predictive performance of the model.

Model Construction

Currently, logistic regression is one of the commonly used methods for identifying risk factors that predict the occurrence of complications [11,12]. In an open calcaneal fracture study, compared to the traditional logistic regression model, machine learning methods had 30% higher accuracy and were more suitable for clinical applications [13].

RF is an ensemble learning algorithm consisting of multiple decision trees, each built on a sample drawn by random resampling with replacement, which is suitable for problems where the number of samples is much smaller than the number of features [14]. It also has the advantages of robust effect, fast learning speed, strong generalization ability, and good classification performance for missing data and imbalanced data [15].

The backpropagation neural network (BPNN) is a feed-forward neural network and the most widely used type of neural network [16]. The algorithm has high self-learning and self-adaptive ability, strong generalization ability, and good prediction performance for untrained data. At the same time, the BPNN has high fault tolerance; that is, even if the system is damaged locally, it can still work normally [17].

The extreme gradient boosting (XGBoost) algorithm is a mainstream machine learning algorithm based on tree model boosting [18]. It repeatedly adds tree models that fit the error or residual of the current model and adjusts the weights of misclassified results, so that each new tree concentrates on the samples the model currently handles worst and the overall error is reduced. The XGBoost algorithm has been widely used in clinical studies for predicting the occurrence of diseases and predicting adverse patient outcomes and has been shown to be more effective than other machine learning models in several studies [19-21].

Therefore, in this study, 3 algorithms, namely RF, BPNN, and XGBoost, were used to construct machine learning prediction models (Multimedia Appendix 1).

Statistical Analysis and Model Evaluation

The PostgreSQL database system was used to extract the data. Statistical analysis was performed using SPSS 22.0 (IBM Corp), and data cleaning, model construction, and performance evaluation were performed using Python 3.8. All continuous variables are expressed as medians (quartiles), and count data are expressed as the number of cases (percentages). The Mann-Whitney U test was used for univariate analysis of continuous variables, and Fisher exact test was used for univariate analysis of categorical variables. The Pearson χ² test was used to compare the prediction results of the models. P<.05 was considered to be a statistically significant difference.

The model evaluation indices were the area under the receiver operating characteristic curve (AUROC), accuracy, precision, sensitivity, specificity, and F1-score.

Results

Basic Characteristics of Patients With Severe FNFs

A total of 366 eligible patients with FNF with a mean age of 78 (SD 20.4) years were screened. Compared with surviving patients, in-hospital death occurred in older patients with a mean age of 83 (SD 17.8) years (P<.05). The SAPS II score, lactate dehydrogenase level, and creatinine level of patients in the death group were all significantly higher than those in the surviving group (P<.05) (Table 1).

Table 1. Baseline data of patients in the intensive care unit (ICU) with a femoral neck fracture.
Characteristics | Patients included (n=366) | Survival patients (n=318) | Death patients (n=48) | P value
Male, n (%) | 193 (52.7) | 172 (54.1) | 21 (43.8) | .18
Female, n (%) | 173 (47.3) | 146 (45.9) | 27 (56.2) | .18
Diabetes, n (%) | 67 (18.3) | 60 (18.9) | 7 (14.6) | .47
Hypertension, n (%) | 149 (40.7) | 130 (40.9) | 19 (39.6) | .87
Coronary, n (%) | 86 (23.5) | 70 (22.0) | 16 (33.3) | .09
LOS (h) in ICU (IQR) | 2.7 (1.3-4.9) | 2.6 (1.4-4.7) | 3.0 (1.2-6.1) | .94
BMI (IQR) | 25.1 (21.0-31.3) | 25.6 (21.1-31.5) | 23.9 (20.6-28.6) | .17
Age (years; IQR) | 78.0 (58.0-87.0) | 76.5 (57.0-86.0) | 83.0 (74.5-90.0) | .002
SAPS II score (IQR) | 39.0 (27.8-40.0) | 36.0 (27.0-45.0) | 52.0 (39.5-65.8) | <.001
Calcium (IQR) | 1.092 (1.1-1.1) | 1.092 (1.1-1.1) | 1.094 (1.1-1.1) | .41
Hematocrit (IQR) | 22.33 (22.1-22.6) | 22.35 (22.1-22.6) | 22.25 (22.0-25.1) | .41
Hemoglobin (IQR) | 7.610 (7.5-7.9) | 7.612 (7.5-7.9) | 7.579 (7.5-8.4) | .38
Lactate (IQR) | 2.127 (1.8-2.9) | 2.095 (1.8-2.8) | 2.678 (2.0-4.7) | .001
TnT (IQR) | 0.040 (0.0-0.1) | 0.041 (0.0-0.1) | 0.038 (0.0-0.1) | .69
CK (IQR) | 156.5 (64-584.3) | 171.0 (63.7-601.3) | 133.0 (77.4-445.5) | .60
CKMB (IQR) | 5.000 (3.3-12.0) | 5.000 (3.3-12.0) | 4.925 (3.5-12.6) | .69
Vitamin D (IQR) | 218.7 (191.1-246.5) | 218.7 (191.6-246.0) | 216.1 (189.4-252.7) | .73
Red blood cell (IQR) | 3.435 (3.0-3.9) | 3.425 (3.0-3.9) | 3.470 (3.0-3.9) | .77
White blood cell (IQR) | 10.30 (7.4-13.7) | 10.25 (7.4-13.7) | 11.01 (7.6-14.0) | .67
Creatinine (IQR) | 0.90 (0.7-1.3) | 0.90 (0.7-1.2) | 1.25 (0.7-1.6) | .01

LOS: length of stay. SAPS II: Simplified Acute Physiology Score II. TnT: troponin T. CK: creatine kinase. CKMB: creatine kinase isoenzyme.

Ranking of the Importance of Characteristic Variables

The RF model was used to rank the importance of characteristic variables, and the top 10 variables of characteristic importance (Figure 2) were SAPS II, lactate, creatinine, gender, vitamin D, calcium, CK, CKMB, white blood cell, and age. All biochemical indices were measured within 2 hours after admission to the ICU.
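As an illustration of how such a ranking can be produced, the sketch below fits a random forest with scikit-learn and sorts its impurity-based feature importances. It is a minimal example under stated assumptions, not the authors' code: the data are synthetic, only four of the paper's feature names are borrowed as labels, the hyperparameters are arbitrary, and the paper does not state which importance measure was used. SMOTE is omitted because the synthetic outcome is already roughly balanced; scaling before splitting follows the order described in Methods.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Feature names borrowed from the paper; the values below are synthetic.
feature_names = ["SAPS II", "lactate", "creatinine", "age"]
X = rng.normal(size=(300, len(feature_names)))
# Synthetic outcome driven mostly by the first feature and weakly by the second.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300) > 0).astype(int)

# Min-max scaling to [0, 1], then a 70/30 split, in the order given in Methods.
X_scaled = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.3, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances, sorted from most to least important.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:<12} {score:.3f}")
```

Impurity-based importances sum to 1 and describe how much each feature contributed to the splits of this particular fitted forest; they do not by themselves indicate the direction of an effect or a causal relationship.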
Figure 2. Ranking of important features in the model. CK: creatine kinase; CKMB: creatine kinase isoenzyme; LOS: length of stay; SAPS II: Simplified Acute Physiology Score II; TnT: troponin T.

Model Evaluation

Receiver Operating Characteristic Curve

Three machine learning models and a traditional logistic model were constructed on the training set and verified on the test set. The 3 machine learning models are RF, BPNN, and XGBoost. The receiver operating characteristic curves of the 4 prediction models were obtained, as shown in Figure 3. The AUROCs of the 4 models (RF, BPNN, XGBoost, and logistic regression) on the training set were 1.0, 0.99, 1.00, and 0.85, and the AUROCs on the test set were 0.99, 0.95, 0.98, and 0.86, respectively. The best results were observed for the RF and XGBoost models, followed by the BPNN, and the AUROCs of all 3 machine learning models were 0.95 or higher. The prediction results of the 4 prediction models were analyzed for differences, and the results are shown in Table 2. The prediction accuracy of the 3 machine learning models on the test set was better than that of the traditional logistic regression model, but the difference was not statistically significant (P>.05).

Figure 3. Receiver operating characteristic (ROC) curves of 4 prediction models: (a) random forest; (b) backpropagation neural network; (c) extreme gradient boosting; and (d) logistic regression.

Table 2. Significance analysis of the prediction results of 4 models.

Prediction model | In-hospital death, n (%) | Survival, n (%) | χ² (df) | P value
RF | 103 (53.93) | 88 (46.07) | 2.240 (3) | .52
BPNN | 104 (54.45) | 87 (45.55) | 2.240 (3) | .52
XGBoost | 101 (52.88) | 90 (47.12) | 2.240 (3) | .52
Logistic regression | 91 (47.64) | 100 (52.36) | 2.240 (3) | .52

RF: random forest. BPNN: backpropagation neural network. XGBoost: extreme gradient boosting.

Confusion Matrix

The predictive performance of the 4 models was evaluated using accuracy, precision, sensitivity, specificity, and F1-score. The RF model had the best overall prediction, with accuracy, precision, sensitivity, specificity, and F1-score of 0.96, 0.97, 0.96, 0.97, and 0.92, respectively. The F1-score of both the XGBoost and BPNN was 0.89, but the accuracy, precision, sensitivity, and specificity of XGBoost were higher than those of the BPNN. All 3 machine learning models outperformed the traditional logistic regression model (Figure 4) in terms of prediction performance (Table 3).

Figure 4. Confusion matrices for 4 prediction models; label 1 for the in-hospital death group and label 0 for the survival group: (a) random forest; (b) backpropagation neural network; (c) extreme gradient boosting; and (d) logistic regression.

Table 3. The prediction performance evaluation of four models.

Prediction model | AUROC | Accuracy | Precision | Sensitivity | Specificity | F1-score
RF | 0.99 | 0.96 | 0.97 | 0.96 | 0.97 | 0.92
BPNN | 0.95 | 0.90 | 0.90 | 0.90 | 0.89 | 0.89
XGBoost | 0.98 | 0.93 | 0.95 | 0.92 | 0.94 | 0.89
Logistic regression | 0.86 | 0.74 | 0.80 | 0.70 | 0.79 | 0.79

AUROC: area under the receiver operating characteristic curve. RF: random forest. BPNN: backpropagation neural network. XGBoost: extreme gradient boosting.
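The single χ² value repeated in every row of Table 2 is consistent with one Pearson χ² test on the 4 × 2 table of predicted outcomes (4 models by 2 predicted classes). The sketch below recomputes it from the counts in Table 2 using only the Python standard library. Treating Table 2 as a single 4 × 2 contingency test is an assumption inferred from the reported degrees of freedom; the function names are illustrative, and the P value uses the closed-form upper tail of the χ² distribution that holds specifically for 3 degrees of freedom.

```python
import math

# Test-set predictions per model from Table 2:
# (predicted in-hospital death, predicted survival)
table2 = {
    "RF": (103, 88),
    "BPNN": (104, 87),
    "XGBoost": (101, 90),
    "Logistic regression": (91, 100),
}


def pearson_chi_square(rows):
    """Return (statistic, degrees of freedom) for an r x c contingency table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, df = pearson_chi_square(list(table2.values()))
p_value = chi2_sf_df3(stat)
print(f"chi2 = {stat:.3f}, df = {df}, P = {p_value:.2f}")  # chi2 = 2.240, df = 3, P = 0.52
```

The recomputed values match the χ² (df) and P value columns of Table 2, which supports reading the table as a comparison of how often each model predicted death versus survival on the 191 test cases, rather than a comparison of their accuracies.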
Discussion

Principal Findings

In this study, 3 high-performing machine learning algorithms were selected to develop in-hospital mortality risk prediction models for patients with severe FNFs, including an RF model, a BPNN model, and an XGBoost model. The 3 machine learning models exhibited excellent performance on both the training and validation sets, with AUROC of the test set being 0.99, 0.95, and 0.98, respectively, and with better predictive performance compared to the traditional statistical logistic model. Meanwhile, the RF model was used in this study to rank the common predictors by calculating the importance of the feature variables. SAPS II, lactate, creatinine, gender, vitamin D, calcium, CK, CKMB, white blood cell, and age were further identified as significant predictors of death in patients with FNFs.

Comparison With Prior Work

The logistic model, a traditional statistical prediction model, has been more widely used in the prediction of morbidity and mortality in FNF [22]. However, logistic regression is sensitive to multicollinearity; it has difficulty dealing with data imbalance; the accuracy of the model is low; and its ability to fit the true distribution of the data is poor. In recent years, machine learning has been continuously applied to the prediction of disease occurrence and adverse outcomes in medicine. For example, Gao et al [23] predicted the risk of acute kidney injury in patients in ICU using logistic regression, RF, and LightGBM algorithms. The 3 models predicted the risk of acute kidney injury after 24 hours with increasing sensitivity, and the model efficacy of the RF and LightGBM algorithms was significantly better than that of logistic regression. Wang et al [24] used machine learning to construct models to predict and analyze the risk factors of femoral head necrosis after internal fixation in patients with FNF, and the results showed good consistency between the probability predicted by machine learning and the actual risk of necrosis. In this study, the prediction effect of machine learning models was compared with that of the traditional logistic regression model, and it was confirmed that machine learning models had good performance in predicting in-hospital mortality of patients with severe FNF, which was consistent with the above conclusion.

Several of the predictors identified by the RF importance ranking in this study have also been reported in prior work. In a previous study, Seitz et al [25] found that defective bone mineralization and a decrease in 25-hydroxy vitamin D were associated with increased mortality in FNFs. 25-hydroxy vitamin D is the primary form of vitamin D present in the blood. Vitamin D and serum calcium were important factors affecting in-hospital mortality in patients with FNFs in this study, which supports this finding and suggests that balancing serum 25-hydroxy vitamin D levels through calcium supplementation and other measures in clinical treatment may reduce mortality in FNFs. In a prospective controlled study by Paccou et al [26], lactate dehydrogenase levels and creatinine levels were significant predictors of bone mineral density (BMD) loss; BMD in turn was associated with mortality, and faster BMD loss was associated with a higher risk of death [27], which is consistent with the results of this study. In addition, compared to previous studies regarding the prediction of mortality in FNFs [28-30], this study found that the SAPS II score was also significant for predicting mortality in patients. SAPS II consists of 12 physiological variables, age, type of hospitalization, and 3 types of chronic disease, and the measurement of SAPS II daily after admission to the ICU can predict the risk of death [31]. However, in existing prediction studies [32-34], the SAPS II score is commonly used in prognostic studies of patients with neurological diseases, abdominal infections, and respiratory distress, and there are fewer studies on the predictive ability of the SAPS II score in FNFs. The results of this study are important for further refining the prediction of morbidity and mortality in patients with FNFs.

Limitations

This study also has some limitations. First, this was a single-center study based on the MIMIC III database without external database validation, and the performance of the model needs to be further validated by prospective studies. Second, the interpretability of the machine learning model was poor, and although feature importance ranking was performed, the causal relationship between these features and in-hospital mortality in patients with FNFs could not be evaluated from a statistical perspective. Finally, some imaging metrics could not be included in the model due to limitations in the available data types in the MIMIC III database. Next, we will validate the existing model on a domestic database, adjust the parameters to improve the model performance, and better adapt the model to the domestic database. Furthermore, we will extend the study timeline to establish a clinically applicable in-hospital mortality risk prediction model for patients with severe FNFs.

Conclusions

In summary, we used patients’ clinical data to develop 3 machine learning models for predicting the risk of in-hospital death in patients with severe FNFs. The prediction performance of all 3 machine learning models was better than that of the traditional logistic model, and the RF model displayed the best prediction performance among the 3 models. In the future, after validation on a domestic database and adjustment of the model parameters, this model could be applied to clinical practice to better assist clinicians in decision-making, adjust treatment plans for patients with severe FNFs, better allocate medical supplies, and reduce the occurrence of adverse outcomes. Considering that MIMIC is a foreign database with fewer Asian patients and therefore does not fully represent domestic FNF cases, more domestic patient data will be included in future work to adjust the model to make it more compatible with the characteristics of the domestic FNF population.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (grant number 81872718), Shanghai Municipal Health and Family Planning Commission (201840041), and Key undergraduate course project of Shanghai Education Commission (201965).

Conflicts of Interest

None declared.

Multimedia Appendix 1

Paper code. [PDF File (Adobe PDF File), 253 KB-Multimedia Appendix 1]

References

1. Zhang Y. Selection strategy and progress on the treatment of femoral neck fractures. Zhongguo Gu Shang 2015;28(9):781-783. [doi: 10.1093/med/9780199550647.003.012051]
2. Thorngren K, Hommel A, Norrman P, Thorngren J, Wingstrand H. Epidemiology of femoral neck fractures. Injury 2002 Dec;33:1-7. [doi: 10.1016/s0020-1383(02)00324-8]
3. Sun X, Zeng R, Hu Z. Femoral head necrosis after treatment of femoral neck fractures with compressive hollow screws. Chin J Orthop Trauma 2012;14(6):477-479.
4. Johnson AE, Pollard TJ, Shen L, Lehman LH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data 2016 May 24;3(1):160035. [doi: 10.1038/sdata.2016.35] [Medline: 27219127]
5. Zhang Y. Prediction of mortality in intensive care patients based on machine learning. University of Electronic Science and Technology of China 2018.
6. Anthony Celi L, Mark RG, Stone DJ, Montgomery RA. “Big Data” in the Intensive Care Unit. Closing the Data Loop. Am J Respir Crit Care Med 2013 Jun 01;187(11):1157-1160. [doi: 10.1164/rccm.201212-2311ed]
7. Frost SA, Nguyen ND, Black DA, Eisman JA, Nguyen TV. Risk factors for in-hospital post-hip fracture mortality. Bone 2011 Sep;49(3):553-558. [doi: 10.1016/j.bone.2011.06.002] [Medline: 21689802]
8. Sheikh HQ, Hossain FS, Aqil A, Akinbamijo B, Mushtaq V, Kapoor H. A Comprehensive Analysis of the Causes and Predictors of 30-Day Mortality Following Hip Fracture Surgery. Clin Orthop Surg 2017 Mar;9(1):10-18. [doi: 10.4055/cios.2017.9.1.10] [Medline: 28261422]
9. Dhingra M, Goyal T, Yadav A, Choudhury A. One-year mortality rates and factors affecting mortality after surgery for fracture neck of femur in the elderly. J Midlife Health 2021;12(4):276-280. [doi: 10.4103/jmh.jmh_208_20] [Medline: 35264833]
10. Tang F, Ishwaran H. Random Forest Missing Data Algorithms. Stat Anal Data Min 2017 Dec 13;10(6):363-377. [doi: 10.1002/sam.11348] [Medline: 29403567]
11. Yin W, Xu Z, Sheng J, Zhang C, Zhu Z. Logistic regression analysis of risk factors for femoral head osteonecrosis after healed intertrochanteric fractures. Hip Int 2016 May 16;26(3):215-219. [doi: 10.5301/hipint.5000346] [Medline: 27013487]
12. Pavlou M, Ambler G, Seaman S, Guttmann O, Elliott P, King M, et al. How to develop a more accurate risk prediction model when there are few events. BMJ 2015 Aug 11;351:h3868. [doi: 10.1136/bmj.h3868] [Medline: 26264962]
13. Bevevino A, Dickens J, Potter B, Dworak T, Gordon W, Forsberg J. A model to predict limb salvage in severe combat-related open calcaneus fractures. Clin Orthop Relat Res 2014 Oct;472(10):3002-3009. [doi: 10.1007/s11999-013-3382-z] [Medline: 24249536]
14. Breiman L. Random forests. Machine Learning 2001;45:5-32. [doi: 10.1023/A:1010933404324]
15. Khoshgoftaar T, Golawala M, Van HJ. An empirical study of learning from imbalanced data using random forest. 2007 Presented at: 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI); Jan 04, 2008; Patras, Greece. [doi: 10.1109/ictai.2007.46]
16. Wang L, Zeng Y, Zhang J, Huang W, Bao Y. The criticality of spare parts evaluating model using artificial neural network approach. 2006 Presented at: International Conference on Computational Science; May 28-31, 2006; Reading, UK p. 728-735. [doi: 10.1007/11758501_97]
17. Li H, Li H. Game design of self-automation based on artificial neural nets and genetic algorithms. 2009 Presented at: Second International Conference on Intelligent Computation Technology and Automation; October 10-11, 2009; Changsha, China. [doi: 10.1109/icicta.2009.86]
18. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. ACM Digital Library. 2016. URL: https://dl.acm.org/doi/pdf/10.1145/2939672.2939785 [accessed 2022-08-12]
19. Ogunleye A, Wang Q. XGBoost Model for Chronic Kidney Disease Diagnosis. IEEE/ACM Trans. Comput. Biol. and Bioinf 2020 Nov 1;17(6):2131-2140. [doi: 10.1109/tcbb.2019.2911071]
20. Wang L, Wang X, Chen A, Jin X, Che H. Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model. Healthcare (Basel) 2020 Jul 31;8(3):247. [doi: 10.3390/healthcare8030247] [Medline: 32751894]
21. Torlay L, Perrone-Bertolotti M, Thomas E, Baciu M. Machine learning-XGBoost analysis of language networks to classify patients with epilepsy. Brain Inform 2017 Sep 22;4(3):159-169. [doi: 10.1007/s40708-017-0065-7] [Medline: 28434153]
22. Zheng JQ, Wang H, Gao YS, Ai ZS. Establishment and initial validation of the prediction model for postoperative complications of femoral neck fracture. Journal of TONGJI University (Medical Science) 2020;41(06):739-746. [doi: 10.16118/j.1008-0392.2020.06.010]
23. Gao WP, Lv HJ, Zhou L, Guo SW. Decision tree algorithm applied to MIMIC-III database for the prediction of acute kidney injury in ICU patients. Beijing Biomedical Engineering 2021;40(06):609-617. [doi: 10.3969/j.issn.1002-3208.2021.06.010]
24. Wang H, Wu W, Han C, Zheng J, Cai X, Chang S, et al. Prediction Model of Osteonecrosis of the Femoral Head After Femoral Neck Fracture: Machine Learning-Based Development and Validation Study. JMIR Med Inform 2021 Nov 19;9(11):e30079. [doi: 10.2196/30079] [Medline: 34806984]
25. Seitz S, Koehne T, Ries C, De Novo Oliveira A, Barvencik F, Busse B, et al. Impaired bone mineralization accompanied by low vitamin D and secondary hyperparathyroidism in patients with femoral neck fracture. Osteoporos Int 2013 Feb 12;24(2):641-649. [doi: 10.1007/s00198-012-2011-0] [Medline: 22581296]
26. Paccou J, Merlusca L, Henry-Desailly I, Parcelier A, Gruson B, Royer B, et al. Alterations in bone mineral density and bone turnover markers in newly diagnosed adults with lymphoma receiving chemotherapy: a 1-year prospective pilot study. Ann Oncol 2014 Feb;25(2):481-486. [doi: 10.1093/annonc/mdt560] [Medline: 24401926]
27. Marques EA, Elbejjani M, Gudnason V, Sigurdsson G, Lang T, Sigurdsson S, et al. Proximal Femur Volumetric Bone Mineral Density and Mortality: 13 Years of Follow-Up of the AGES-Reykjavik Study. J Bone Miner Res 2017 Jun 20;32(6):1237-1242. [doi: 10.1002/jbmr.3104] [Medline: 28276125]
28. Bokshan SL, Marcaccio SE, Blood TD, Hayda RA. Factors influencing survival following hip fracture among octogenarians and nonagenarians in the United States. Injury 2018 Mar;49(3):685-690. [doi: 10.1016/j.injury.2018.02.004] [Medline: 29426609]
29. Fakler JK, Grafe A, Dinger J, Josten C, Aust G. Perioperative risk factors in patients with a femoral neck fracture - influence of 25-hydroxyvitamin D and C-reactive protein on postoperative medical complications and 1-year mortality. BMC Musculoskelet Disord 2016 Feb 01;17(1):51. [doi: 10.1186/s12891-016-0906-1] [Medline: 26833068]
30. Sebestyén A, Boncz I, Sándor J, Nyárády J. Effect of surgical delay on early mortality in patients with femoral neck fracture. Int Orthop 2008 Jun 24;32(3):375-379. [doi: 10.1007/s00264-007-0331-z] [Medline: 17323093]
31. Le Gall J. A New Simplified Acute Physiology Score (SAPS II) Based on a European/North American Multicenter Study. JAMA 1993 Dec 22;270(24):2957. [doi: 10.1001/jama.1993.03510240069035]
32. Ma LS, Su YY, Li X. Application of simplified acute physiological score II to predict the probability of death in patients with critical neurological diseases. Chinese Journal of Neurology 2010;11:774-777. [doi: 10.3760/cma.j.issn.1006-7876.2010.11.009]
33. Kuang G, Chen Y, Wei XS. The role of 24h LCR, SOFA score and SAPS II score in the prognosis evaluation of sepsis-induced by abdominal infection. J Hunan Normal Univ (Med Sci) 2020;17(01):26-29.
34. Liu H, Xiao J, Hu X, Wang I, Zhou F. The role of simplified acute physiological score-3 in selecting cortisol hormone therapy in patients with moderate to severe acute respiratory distress syndrome. Journal of Capital Medical University 2021;42(06):915-922.
[doi: 10.3969/j.issn.1006-7795.2021.06.003] Abbreviations AUROC: area under the receiving operating characteristic curve BMD: bone mineral density BPNN: backpropagation neural network CK: creatine kinase CKMB: creatine kinase isoenzyme FNF: femoral neck fracture ICU: intensive care unit MIMIC: Medical Information Mart for Intensive Care RF: random forest SAPS II: Simplified Acute Physiology Score II SMOTE: synthetic minority oversampling technique XGBoost: extreme gradient boosting Edited by A Mavragani; submitted 24.03.22; peer-reviewed by O Fajarda Oliveira, DZ Pan; comments to author 29.06.22; revised version received 13.07.22; accepted 09.08.22; published 19.08.22 Please cite as: Xu L, Liu J, Han C, Ai Z The Application of Machine Learning in Predicting Mortality Risk in Patients With Severe Femoral Neck Fractures: Prediction Model Development Study JMIR Bioinform Biotech 2022;3(1):e38226 URL: https://bioinform.jmir.org/2022/1/e38226 doi: 10.2196/38226 PMID: https://bioinform.jmir.org/2022/1/e38226 JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226 | p. 11 (page number not for citation purposes) XSL FO RenderX JMIR BIOINFORMATICS AND BIOTECHNOLOGY Xu et al ©Lingxiao Xu, Jun Liu, Chunxia Han, Zisheng Ai. Originally published in JMIR Bioinformatics and Biotechnology (https://bioinform.jmir.org), 19.08.2022. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Bioinformatics and Biotechnology, is properly cited. The complete bibliographic information, a link to the original publication on https://bioinform.jmir.org/, as well as this copyright and license information must be included. https://bioinform.jmir.org/2022/1/e38226 JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226 | p. 
12 (page number not for citation purposes) XSL FO RenderX http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png JMIR Bioinformatics and Biotechnology JMIR Publications

The Application of Machine Learning in Predicting Mortality Risk in Patients With Severe Femoral Neck Fractures: Prediction Model Development Study



Publisher
JMIR Publications
Copyright
Copyright © The Author(s). Licensed under Creative Commons Attribution cc-by 4.0
ISSN
2563-3570
DOI
10.2196/38226

Abstract

Background: Femoral neck fracture (FNF) accounts for approximately 3.58% of all fractures in the body, and its incidence is increasing each year. According to a survey, in 1990, the total number of hip fractures worldwide was approximately 338,000 in men and 917,000 in women. In China, FNFs account for 48.22% of hip fractures. Many studies have examined postdischarge mortality and mortality risk in patients with FNF. However, there have been no definitive studies on in-hospital mortality or its influencing factors in patients with severe FNF admitted to the intensive care unit.

Objective: This study used 3 machine learning methods to construct an in-hospital (nosocomial) death prediction model for patients with FNF admitted to intensive care units, to assist clinicians in early clinical decision-making.

Methods: A retrospective analysis was conducted using data on patients with FNF from the Medical Information Mart for Intensive Care III (MIMIC-III) database. After balancing the data set using the Synthetic Minority Oversampling Technique algorithm, patients were randomly separated into a 70% training set and a 30% testing set for the development and validation, respectively, of the prediction model. Random forest, extreme gradient boosting, and backpropagation neural network prediction models were constructed with nosocomial death as the outcome. Model performance was assessed using the area under the receiver operating characteristic curve, accuracy, precision, sensitivity, and specificity. The predictive value of the models was verified in comparison to the traditional logistic model.

Results: A total of 366 patients with FNFs were selected, including 48 cases (13.1%) of in-hospital death. Balancing the in-hospital death and survival groups to a 1:1 ratio yielded a data set of 636 records.
The 3 machine learning models exhibited high predictive accuracy: the areas under the receiver operating characteristic curve of the random forest, extreme gradient boosting, and backpropagation neural network models were 0.98, 0.97, and 0.95, respectively, all higher than that of the traditional logistic regression model. In the ranking of feature importance, the top 10 variables for predicting the risk of in-hospital death were the Simplified Acute Physiology Score II, lactate, creatinine, gender, vitamin D, calcium, creatine kinase, creatine kinase isoenzyme, white blood cell, and age.

Conclusions: Death risk assessment models constructed using machine learning have positive significance for predicting the in-hospital mortality of patients with severe disease and provide a valid basis for reducing in-hospital mortality and improving patient prognosis.

(JMIR Bioinform Biotech 2022;3(1):e38226) doi: 10.2196/38226

Keywords: machine learning; femoral neck fracture; hospital mortality; hip; fracture; mortality; prediction; intensive care unit; ICU; decision-making; risk; assessment; prognosis

Introduction

Femoral neck fracture (FNF) accounts for approximately 3.58% of all fractures in the body [1], and its incidence is increasing each year. According to a survey, in 1990, the total number of hip fractures worldwide was approximately 338,000 in men and 917,000 in women [2]. In China, FNFs account for 48.22% of hip fractures [3].

The Medical Information Mart for Intensive Care (MIMIC) III database is a publicly available database commonly used in clinical research [4], which contains medical data on approximately 60,000 patients in the intensive care unit (ICU) at Beth Israel Deaconess Medical Center from 2001 to 2012. The ICU database is higher-dimensional, denser, and more valuable in the field of medicine than the general patient electronic medical record database [5]. The large amount of data recorded from these treatments and examinations is conducive to the close observation of ICU patients to detect physiological changes associated with deterioration and to provide more valuable data for clinical research [6].

Currently, many studies have been conducted on postdischarge mortality and mortality risk in patients with FNF [7-9]. Sheikh et al [8] used a backward stepwise likelihood ratio Cox regression model to comprehensively analyze the causes of death within 30 days after surgery in patients with FNF and found that age, admission hemoglobin, and history of myocardial infarction were important factors that increased mortality. Dhingra et al [9] retrospectively analyzed the factors influencing 1-year postoperative mortality in patients older than 60 years with FNF and found that smoking, hypertension, diabetes, low hemoglobin, elevated white blood cell count, and surgical delay (>1 week) were significantly associated with higher 1-year postoperative mortality. Frost et al [7] used a logistic regression model to determine the risk factors of postoperative nosocomial death in patients with FNF and used a nomogram model to predict the short-term risk of death. These studies showed that age, gender, and complications were the main risk factors for nosocomial death in patients with femoral neck fracture.

However, there have been no definitive studies on in-hospital mortality or its influencing factors in patients with severe FNF admitted to the ICU. Therefore, in this study, we used the electronic case information of patients with FNF recorded in the MIMIC database to examine the factors of in-hospital mortality using a machine learning model, to identify indicators that are meaningful for predicting in-hospital mortality, and to inform preventive measures that reduce in-hospital mortality as early as possible.

Methods

Data Source

Patient data from MIMIC-III were used for this study. MIMIC-III is a database commonly used in critical care big data studies; it contains clinical information such as demographics, vital signs, laboratory tests, treatment protocols, and diagnostic codes for 46,520 patients in the ICU.

Ethical Considerations

The MIMIC-III database was approved by the Massachusetts Institute of Technology (Cambridge, MA) and Beth Israel Deaconess Medical Center (Boston, MA). The authors obtained the right to download and use the database by passing the Protecting Human Research Participants exam (No. 38335409). Therefore, the ethical approval statement and the need for informed consent were waived for this manuscript.

Inclusion and Exclusion Criteria

In this study, patients admitted to the ICU for FNFs were extracted from the MIMIC-III database according to their diagnosis codes. The case information included in this study was based on the first admission, and data from patients with the first diagnosis code of FNF, including rotator fracture and intertrochanteric fracture, were selected according to the order of diagnosis codes. Patients aged ≤18 years or with an ICU length of stay <24 hours were excluded, as were patients with grossly incomplete medical data records (>50% of values missing). The case screening process is shown in Figure 1.

Figure 1. Case screening flowchart. ICU: intensive care unit; MIMIC: Medical Information Mart for Intensive Care.
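The exclusion rules above can be expressed as a simple record filter. The following is a minimal sketch, not the authors' extraction code: the record layout and the field names `age_years` and `icu_los_hours` are hypothetical and are not MIMIC-III column names, and missing values are assumed to be encoded as `None`.

```python
from typing import Dict, List, Optional

# Hypothetical record layout; field names are illustrative only and are
# not MIMIC-III column names. Missing values are encoded as None.
Record = Dict[str, Optional[float]]


def missing_fraction(record: Record) -> float:
    """Share of fields in a record whose value is missing (None)."""
    if not record:
        return 1.0
    return sum(v is None for v in record.values()) / len(record)


def is_eligible(record: Record) -> bool:
    """Apply the three exclusion rules stated in the text."""
    age = record.get("age_years")
    los = record.get("icu_los_hours")
    if age is None or los is None:
        return False  # assumption: eligibility cannot be checked without these
    if age <= 18:       # patients aged <=18 years are excluded
        return False
    if los < 24:        # ICU length of stay under 24 hours is excluded
        return False
    if missing_fraction(record) > 0.5:  # more than 50% of values missing
        return False
    return True


# Made-up example records.
patients: List[Record] = [
    {"age_years": 82, "icu_los_hours": 60, "lactate": 2.1, "creatinine": 0.9},
    {"age_years": 17, "icu_los_hours": 48, "lactate": 1.8, "creatinine": 0.7},
    {"age_years": 75, "icu_los_hours": 12, "lactate": 2.5, "creatinine": 1.1},
    {"age_years": 90, "icu_los_hours": 30, "lactate": None, "creatinine": None,
     "calcium": None, "vitamin_d": None},
]
cohort = [p for p in patients if is_eligible(p)]
```

In this toy input only the first record survives: the second fails the age rule, the third the length-of-stay rule, and the fourth has 4 of 6 values missing.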
Data Collection

Data were collected based on clinical experience, published literature, and data recorded in the MIMIC-III database. Data collection for patients with FNFs covered the following 3 main areas: (1) demographic information: sex, age, BMI, length of ICU stay, history of previous illness, and Simplified Acute Physiology Score II (SAPS II); (2) physiological and biochemical indices within 24 hours after admission to the ICU: serum calcium, hemoglobin, hematocrit, lactate, cardiac troponin T level, creatine kinase (CK), creatine kinase isoenzyme (CKMB), vitamin D, red blood cells, white blood cells, and creatinine; and (3) outcome: whether in-hospital death occurred after admission to the ICU in patients with critical FNFs.

Data Preprocessing

The variables included in the study were screened to exclude cases with more than 50% missing values. For cases with no more than 50% missing data, the random forest (RF) algorithm was used to impute variables containing missing values sequentially in a loop [10]. Common methods for filling missing data are the mean, mode, median, and fixed value methods, and the RF algorithm is a promising alternative. The variable with missing values is treated as the label to be predicted, and a model is built on the observed cases to obtain predicted values for filling. RF imputation can handle mixed types of missing data and has the potential to scale up to big data environments.

The outcome labels extracted in this study are unbalanced (48/366, 13.1% cases in the death group and 318/366, 86.9% cases in the survival group), and a model trained by a machine learning algorithm on an unbalanced data set is prone to biased predictions; therefore, the original data set needs to be balanced. In this study, the synthetic minority oversampling technique (SMOTE) function in the “imblearn” library of Python (Python Software Foundation) was used to balance the data set. For each sample x in the smaller (mortality) class, the SMOTE algorithm randomly selects a sample y from the k-nearest neighbors of x and randomly synthesizes a new mortality sample on the line segment between x and y. Starting from the 48 samples in the original mortality group, 270 new mortality samples were randomly synthesized and added to the data set to obtain a new balanced data set (mortality group: survival group = 1:1).

Commonly used normalization methods are linear function normalization (min-max scaling) and 0-mean normalization (z-score standardization). Linear function normalization was used in this study to normalize the newly balanced data set. Normalization eliminates the computational errors caused by different data scales and maps the data to the range of 0-1 so that each feature is treated equally by the classifier.

The normalized data set was randomly assigned to the training set and the test set at a ratio of 7:3. Finally, 445 cases were obtained for training the prediction model, and 191 cases were used to verify the predictive performance of the model.

Model Construction

Currently, logistic regression is one of the commonly used methods for identifying risk factors that predict the occurrence of complications [11,12]. In an open calcaneal fracture study, machine learning methods had 30% higher accuracy than the traditional logistic regression model and were more suitable for clinical applications [13].

RF is an ensemble learning algorithm consisting of multiple decision trees, each built on a bootstrap sample (random sampling with replacement), which is suitable for problems where the number of samples is much smaller than the number of features [14]. It also has the advantages of robustness, fast learning speed, strong generalization ability, and good classification performance for missing data and imbalanced data [15].

Backpropagation neural network (BPNN) is a feed-forward neural network and the most widely used type of neural network [16]. The algorithm has high self-learning and self-adaptive ability, strong generalization ability, and good prediction performance for untrained data. At the same time, the BPNN has high fault tolerance; that is, even if the system is damaged locally, it can still work normally [17].

Extreme gradient boosting (XGBoost) is a mainstream machine learning algorithm based on tree model boosting [18]. It continuously updates the error or residual of the model by adding tree models and then adjusts the weight of the misclassification results so that the model can select samples more intelligently and reduce the errors generated by the model. The XGBoost algorithm has been widely used in clinical studies for predicting the occurrence of diseases and adverse patient outcomes and has been shown to be more effective than other machine learning models in several studies [19-21].

Therefore, in this study, 3 algorithms, namely RF, BPNN, and XGBoost, were used to construct machine learning prediction models (Multimedia Appendix 1).

Statistical Analysis and Model Evaluation

The PostgreSQL database system was used to extract the data. Statistical analysis was performed using SPSS 22.0 (IBM Corp), and data cleaning, model construction, and performance evaluation were performed using Python 3.8. All continuous variables are expressed as medians (quartiles), and count data are expressed as the number of cases (percentages). The Mann-Whitney U test was used for univariate analysis of continuous variables, and the Fisher exact test was used for univariate analysis of categorical variables. The Pearson chi-square test was used to test for differences among the prediction results of the models. P<.05 was considered to be a statistically significant difference.

The model evaluation indices were the area under the receiver operating characteristic curve (AUROC), accuracy, precision, sensitivity, specificity, and F1-score.

Results

Basic Characteristics of Patients With Severe FNFs

A total of 366 eligible patients with FNF with a mean age of 78 (SD 20.4) years were screened. Compared with surviving patients, in-hospital death occurred in older patients with a mean age of 83 (SD 17.8) years (P<.05). The SAPS II score, lactate dehydrogenase level, and creatinine level of patients in the death group were all significantly higher than those in the surviving group (P<.05) (Table 1).

Table 1. Baseline data of patients in the intensive care unit (ICU) with a femoral neck fracture.
Characteristics | Patients included (n=366) | Survival patients (n=318) | Death patients (n=48) | P value
Male, n (%) | 193 (52.7) | 172 (54.1) | 21 (43.8) | .18
Female, n (%) | 173 (47.3) | 146 (45.9) | 27 (56.2) | .18
Diabetes, n (%) | 67 (18.3) | 60 (18.9) | 7 (14.6) | .47
Hypertension, n (%) | 149 (40.7) | 130 (40.9) | 19 (39.6) | .87
Coronary, n (%) | 86 (23.5) | 70 (22.0) | 16 (33.3) | .09
LOS (h) in ICU (IQR) | 2.7 (1.3-4.9) | 2.6 (1.4-4.7) | 3.0 (1.2-6.1) | .94
BMI (IQR) | 25.1 (21.0-31.3) | 25.6 (21.1-31.5) | 23.9 (20.6-28.6) | .17
Age (years; IQR) | 78.0 (58.0-87.0) | 76.5 (57.0-86.0) | 83.0 (74.5-90.0) | .002
SAPS II score (IQR) | 39.0 (27.8-40.0) | 36.0 (27.0-45.0) | 52.0 (39.5-65.8) | <.001
Calcium (IQR) | 1.092 (1.1-1.1) | 1.092 (1.1-1.1) | 1.094 (1.1-1.1) | .41
Hematocrit (IQR) | 22.33 (22.1-22.6) | 22.35 (22.1-22.6) | 22.25 (22.0-25.1) | .41
Hemoglobin (IQR) | 7.610 (7.5-7.9) | 7.612 (7.5-7.9) | 7.579 (7.5-8.4) | .38
Lactate (IQR) | 2.127 (1.8-2.9) | 2.095 (1.8-2.8) | 2.678 (2.0-4.7) | .001
TnT (IQR) | 0.040 (0.0-0.1) | 0.041 (0.0-0.1) | 0.038 (0.0-0.1) | .69
CK (IQR) | 156.5 (64-584.3) | 171.0 (63.7-601.3) | 133.0 (77.4-445.5) | .60
CKMB (IQR) | 5.000 (3.3-12.0) | 5.000 (3.3-12.0) | 4.925 (3.5-12.6) | .69
Vitamin D (IQR) | 218.7 (191.1-246.5) | 218.7 (191.6-246.0) | 216.1 (189.4-252.7) | .73
Red blood cell (IQR) | 3.435 (3.0-3.9) | 3.425 (3.0-3.9) | 3.470 (3.0-3.9) | .77
White blood cell (IQR) | 10.30 (7.4-13.7) | 10.25 (7.4-13.7) | 11.01 (7.6-14.0) | .67
Creatinine (IQR) | 0.90 (0.7-1.3) | 0.90 (0.7-1.2) | 1.25 (0.7-1.6) | .01

LOS: length of stay. SAPS II: Simplified Acute Physiology Score II. TnT: troponin T.

Ranking of the Importance of Characteristic Variables

The RF model was used to rank the importance of characteristic variables, and the top 10 variables of characteristic importance (Figure 2) were SAPS II, lactate, creatinine, gender, vitamin D, calcium, CK, CKMB, white blood cell, and age. All biochemical indices were measured within 2 hours after admission to the ICU.
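The models behind this ranking were trained on data that had first been balanced with SMOTE and min-max scaled, as described under Data Preprocessing. As a rough illustration of those two steps, the sketch below interpolates synthetic minority samples and rescales features using only the Python standard library. It is a simplified from-scratch illustration, not the `imblearn` implementation used in the study; the function names, the choice of k=5, the toy feature values, and the rounding-down of the 70% share are all assumptions. Only the class sizes (48 and 318) come from the paper.

```python
import math
import random


def k_nearest(x, pool, k):
    """Return the k points in pool closest to x (Euclidean), excluding x itself."""
    others = [p for p in pool if p is not x]
    return sorted(others, key=lambda p: math.dist(x, p))[:k]


def smote_like(minority, n_new, k=5, seed=0):
    """Synthesize n_new points, each on the segment between a minority
    sample x and one of its k nearest minority neighbours y."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        y = rng.choice(k_nearest(x, minority, k))
        u = rng.random()  # position along the x-y segment, in [0, 1)
        synthetic.append([xi + u * (yi - xi) for xi, yi in zip(x, y)])
    return synthetic


def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


# Class sizes reported in the study: 48 deaths, 318 survivors.
n_deaths, n_survivors = 48, 318
n_new = n_survivors - n_deaths      # synthetic deaths needed for a 1:1 ratio
n_balanced = 2 * n_survivors        # size of the balanced data set
n_train = int(0.7 * n_balanced)     # assumption: 70% share rounded down
n_test = n_balanced - n_train

# Toy minority class with two made-up features.
toy_rng = random.Random(1)
minority = [[toy_rng.uniform(40, 70), toy_rng.uniform(2, 5)] for _ in range(n_deaths)]
balanced_minority = minority + smote_like(minority, n_new)
scaled = min_max_scale(balanced_minority)
```

With these class sizes the arithmetic gives 270 synthetic samples, 636 balanced records, and a 445/191 split, the same counts the paper reports. Because each synthetic point is a convex combination of two real minority points, it always stays inside the range of the original minority data.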
Figure 2. Ranking of important features in the model. CK: creatine kinase; CKMB: creatine kinase isoenzyme; los: length of stay; SAPII: Simplified Acute Physiology Score II; TnT: troponin T.

Model Evaluation

Receiver Operating Characteristic Curve

Three machine learning models (RF, BPNN, and XGBoost) and a traditional logistic model were constructed on the training set and verified on the test set. The receiver operating characteristic curves of the 4 prediction models are shown in Figure 3. The AUROCs of the RF, BPNN, XGBoost, and logistic regression models were 1.00, 0.99, 1.00, and 0.85, respectively, on the training set and 0.99, 0.95, 0.98, and 0.86, respectively, on the test set. The best results were observed for the RF and XGBoost models, followed by the BPNN, and the AUROCs of all 3 machine learning models were 0.95 or higher. The prediction results of the 4 models were tested for differences, and the results are shown in Table 2. The prediction accuracy of the 3 machine learning models on the test set was better than that of the traditional logistic regression model, but the difference was not statistically significant (P>.05).

Figure 3. Receiver operating characteristic (ROC) curves of 4 prediction models: (a) random forest; (b) backpropagation neural network; (c) extreme gradient boosting; and (d) logistic regression.

Table 2. Significance analysis of the prediction results of 4 models.
Prediction model | In-hospital death, n (%) | Survival, n (%) | χ² (df) | P value
RF | 103 (53.93) | 88 (46.07) | 2.240 (3) | .52
BPNN | 104 (54.45) | 87 (45.55) | 2.240 (3) | .52
XGBoost | 101 (52.88) | 90 (47.12) | 2.240 (3) | .52
Logistic regression | 91 (47.64) | 100 (52.36) | 2.240 (3) | .52

RF: random forest. BPNN: backpropagation neural network. XGBoost: extreme gradient boosting.

Confusion Matrix

The predictive performance of the 4 models was evaluated using accuracy, precision, sensitivity, specificity, and F1-score. The RF model had the best overall prediction, with accuracy, precision, sensitivity, specificity, and F1-score of 0.96, 0.97, 0.96, 0.97, and 0.92, respectively. The F1-score of both the XGBoost and BPNN models was 0.89, but the accuracy, precision, sensitivity, and specificity of XGBoost were higher than those of the BPNN. All 3 machine learning models outperformed the traditional logistic regression model (Figure 4) in terms of prediction performance (Table 3).

Figure 4. Confusion matrices for 4 prediction models; label 1 for the in-hospital death group and label 0 for the survival group: (a) random forest; (b) backpropagation neural network; (c) extreme gradient boosting; and (d) logistic regression.

Table 3. The prediction performance evaluation of four models.

Prediction model | AUROC | Accuracy | Precision | Sensitivity | Specificity | F1-score
RF | 0.99 | 0.96 | 0.97 | 0.96 | 0.97 | 0.92
BPNN | 0.95 | 0.90 | 0.90 | 0.90 | 0.89 | 0.89
XGBoost | 0.98 | 0.93 | 0.95 | 0.92 | 0.94 | 0.89
Logistic regression | 0.86 | 0.74 | 0.80 | 0.70 | 0.79 | 0.79

AUROC: area under the receiver operating characteristic curve. RF: random forest. BPNN: backpropagation neural network. XGBoost: extreme gradient boosting.
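Two of the calculations behind Tables 2 and 3 are easy to sketch in plain Python. The first treats the Table 2 counts as a 4 × 2 contingency table (model by predicted outcome) and computes the Pearson chi-square statistic; reading the table this way is an interpretation rather than something the paper states, but the statistic then evaluates to about 2.24 with 3 degrees of freedom and P of about .52, matching the reported values. The second shows the standard definitions of the confusion-matrix metrics; the counts passed to it are made up for illustration and are not taken from Figure 4. The function names are hypothetical, and the closed-form tail probability is valid only for 3 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_tot) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Rows: RF, BPNN, XGBoost, logistic regression (counts from Table 2).
# Columns: predicted in-hospital death, predicted survival.
predicted = [[103, 88], [104, 87], [101, 90], [91, 100]]
stat, df = chi_square_independence(predicted)
p_value = chi2_sf_df3(stat)

# Made-up confusion-matrix counts, for illustration only.
toy = classification_metrics(tp=90, fp=5, fn=10, tn=86)
```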
Discussion

Principal Findings

In this study, 3 high-performing machine learning algorithms were selected to develop in-hospital mortality risk prediction models for patients with severe FNFs: an RF model, a BPNN model, and an XGBoost model. The 3 machine learning models exhibited excellent performance on both the training and validation sets, with test set AUROCs of 0.99, 0.95, and 0.98, respectively, and with better predictive performance than the traditional statistical logistic model. Meanwhile, the RF model was used in this study to rank the common predictors by calculating the importance of the feature variables. SAPS II, lactate, creatinine, gender, vitamin D, calcium, CK, CKMB, white blood cell, and age were further identified as significant predictors of death in patients with FNFs.

Comparison With Prior Work

The logistic model, a traditional statistical prediction model, has been more widely used in the prediction of morbidity and mortality in FNF [22]. However, logistic regression is sensitive to multicollinearity in the data, has difficulty dealing with data imbalance, has low accuracy, and fits the true distribution of the data poorly. In recent years, machine learning has been continuously applied to the prediction of disease occurrence and adverse outcomes in medicine. For example, Gao et al [23] predicted the risk of acute kidney injury in patients in the ICU using logistic regression, RF, and LightGBM algorithms. The 3 models predicted the risk of acute kidney injury after 24 hours with increasing sensitivity, and the model efficacy of the RF and LightGBM algorithms was significantly better than that of logistic regression. Wang et al [24] used machine learning to construct models to predict and analyze the risk factors of femoral head necrosis after internal fixation in patients with FNF, and the results showed good consistency between the probability predicted by machine learning and the actual risk of necrosis. In this study, the predictive performance of machine learning models was compared with that of the traditional logistic regression model, and it was confirmed that machine learning models performed well in predicting in-hospital mortality of patients with severe FNF, which is consistent with the above conclusion.

Meanwhile, the RF model was used in this study to rank the common predictors by calculating the importance of the feature variables. SAPS II, lactate, creatinine, gender, vitamin D, calcium, CK, CKMB, white blood cell, and age were further identified as significant predictors of death in patients with FNFs. In a previous study, Seitz et al [25] found that defective bone mineralization and a decrease in 25-hydroxy vitamin D were associated with increased mortality in FNFs. 25-hydroxy vitamin D is the primary form of vitamin D present in the blood. Vitamin D and serum calcium were important factors affecting in-hospital mortality in patients with FNFs in this study, which supports this finding and suggests that balancing serum 25-hydroxy vitamin D levels through calcium supplementation and other measures in clinical treatment may reduce mortality in FNFs. In a prospective controlled study by Paccou et al [26], lactate dehydrogenase levels and creatinine levels were significant predictors of bone mineral density (BMD) loss; BMD in turn was associated with mortality, and faster BMD loss was associated with a higher risk of death [27], which is consistent with the results of this study. In addition, compared to previous studies regarding the prediction of mortality in FNFs [28-30], this study found that the SAPS II score was also significant for predicting mortality in patients. SAPS II consists of 12 physiological variables, age, type of hospitalization, and 3 types of chronic disease, and the measurement of SAPS II daily after admission to the ICU can predict the risk of death [31]. However, in existing prediction studies [32-34], the SAPS II score is commonly used in prognostic studies of patients with neurological diseases, abdominal infections, and respiratory distress, and there are fewer studies on the predictive ability of the SAPS II critical score in FNFs. The results of this study are important for further refining the prediction of morbidity and mortality in patients with FNFs.

Limitations

This study also has some limitations. First, this was a single-center study based on the MIMIC-III database without external database validation, and the performance of the model needs to be further validated by prospective studies. Second, the interpretability of the machine learning model was poor, and although feature importance ranking was performed, the causal relationship between these features and in-hospital mortality in patients with FNFs could not be evaluated from a statistical perspective. Finally, some imaging metrics could not be included in the model due to limitations in the available data types in the MIMIC-III database. Next, we will validate the existing model on a domestic (Chinese) database, adjust the parameters to improve the model performance, and better adapt the model to the domestic database. Furthermore, we will extend the study timeline to establish a clinically applicable in-hospital mortality risk prediction model for patients with severe FNFs.

Conclusions

In summary, we used patients’ clinical data to develop 3 machine learning models for predicting the risk of in-hospital death in patients with severe FNFs. The prediction performance of all 3 machine learning models was better than that of the traditional logistic model, and the RF model displayed the best prediction performance among the 3 models. In the future, after validation on the domestic database and adjustment of the model parameters, this model can be applied to clinical practice to better assist clinicians in decision-making, adjust treatment plans for patients with severe FNFs, better allocate medical supplies, and reduce the occurrence of adverse outcomes. Considering that MIMIC is a foreign database with fewer Asian patients and is therefore not representative of domestic FNF cases, more domestic patient data will be included in future work to make the model more compatible with the characteristics of the domestic FNF population.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (grant number 81872718), Shanghai Municipal Health and Family Planning Commission (201840041), and Key undergraduate course project of Shanghai Education Commission (201965).

Conflicts of Interest

None declared.

Multimedia Appendix 1

Paper code. [PDF File (Adobe PDF File), 253 KB-Multimedia Appendix 1]

References

1. Zhang Y. Selection strategy and progress on the treatment of femoral neck fractures. Zhongguo Gu Shang 2015;28(9):781-783. [doi: 10.1093/med/9780199550647.003.012051]
2. Thorngren K, Hommel A, Norrman P, Thorngren J, Wingstrand H. Epidemiology of femoral neck fractures. Injury 2002 Dec;33:1-7. [doi: 10.1016/s0020-1383(02)00324-8]
3. Sun X, Zeng R, Hu Z. Femoral head necrosis after treatment of femoral neck fractures with compressive hollow screws. Chin J Orthop Trauma 2012;14(6):477-479.
4.
Johnson AE, Pollard TJ, Shen L, Lehman LH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data 2016 May 24;3(1):160035 [FREE Full text] [doi: 10.1038/sdata.2016.35] [Medline: 27219127] 5. Zhang Y. Prediction of mortality in intensive care patients based on machine learning. University of Electronic Science and Technology of China 2018. 6. Anthony Celi L, Mark RG, Stone DJ, Montgomery RA. “Big Data” in the Intensive Care Unit. Closing the Data Loop. Am J Respir Crit Care Med 2013 Jun 01;187(11):1157-1160. [doi: 10.1164/rccm.201212-2311ed] 7. Frost SA, Nguyen ND, Black DA, Eisman JA, Nguyen TV. Risk factors for in-hospital post-hip fracture mortality. Bone 2011 Sep;49(3):553-558. [doi: 10.1016/j.bone.2011.06.002] [Medline: 21689802] 8. Sheikh HQ, Hossain FS, Aqil A, Akinbamijo B, Mushtaq V, Kapoor H. A Comprehensive Analysis of the Causes and Predictors of 30-Day Mortality Following Hip Fracture Surgery. Clin Orthop Surg 2017 Mar;9(1):10-18 [FREE Full text] [doi: 10.4055/cios.2017.9.1.10] [Medline: 28261422] 9. Dhingra M, Goyal T, Yadav A, Choudhury A. One-year mortality rates and factors affecting mortality after surgery for fracture neck of femur in the elderly. J Midlife Health 2021;12(4):276-280 [FREE Full text] [doi: 10.4103/jmh.jmh_208_20] [Medline: 35264833] 10. Tang F, Ishwaran H. Random Forest Missing Data Algorithms. Stat Anal Data Min 2017 Dec 13;10(6):363-377. [doi: 10.1002/sam.11348] [Medline: 29403567] 11. Yin W, Xu Z, Sheng J, Zhang C, Zhu Z. Logistic regression analysis of risk factors for femoral head osteonecrosis after healed intertrochanteric fractures. Hip Int 2016 May 16;26(3):215-219. [doi: 10.5301/hipint.5000346] [Medline: 27013487] 12. Pavlou M, Ambler G, Seaman S, Guttmann O, Elliott P, King M, et al. How to develop a more accurate risk prediction model when there are few events. BMJ 2015 Aug 11;351:h3868 [FREE Full text] [doi: 10.1136/bmj.h3868] [Medline: 26264962] 13. 
Bevevino A, Dickens J, Potter B, Dworak T, Gordon W, Forsberg J. A model to predict limb salvage in severe combat-related open calcaneus fractures. Clin Orthop Relat Res 2014 Oct;472(10):3002-3009 [FREE Full text] [doi: 10.1007/s11999-013-3382-z] [Medline: 24249536] 14. Breiman L. Random forests. Machine Learning 2001;45:5-32. [doi: 10.1023/A:1010933404324] 15. Khoshgoftaar T, Golawala M, Van HJ. An empirical study of learning from imbalanced data using random forest. 2007 Presented at: 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI); Jan 04, 2008; Patras, Greece. [doi: 10.1109/ictai.2007.46] 16. Wang L, Zeng Y, Zhang J, Huang W, Bao Y. The criticality of spare parts evaluating model using artificial neural network approach. 2006 Presented at: International Conference on Computational Science; May 28-31, 2006; Reading, UK p. 728-735. [doi: 10.1007/11758501_97] 17. Li H, Li H. Game design of self-automation based on artificial neural nets and genetic algorithms. 2009 Presented at: Second International Conference on Intelligent Computation Technology and Automation; October 10-11, 2009; Changsha, China. [doi: 10.1109/icicta.2009.86] 18. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. ACM Digital Library. 2016. URL: https://dl.acm.org/doi/ pdf/10.1145/2939672.2939785 [accessed 2022-08-12] 19. Ogunleye A, Wang Q. XGBoost Model for Chronic Kidney Disease Diagnosis. IEEE/ACM Trans. Comput. Biol. and Bioinf 2020 Nov 1;17(6):2131-2140. [doi: 10.1109/tcbb.2019.2911071] 20. Wang L, Wang X, Chen A, Jin X, Che H. Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model. Healthcare (Basel) 2020 Jul 31;8(3):247 [FREE Full text] [doi: 10.3390/healthcare8030247] [Medline: 32751894] 21. Torlay L, Perrone-Bertolotti M, Thomas E, Baciu M. Machine learning-XGBoost analysis of language networks to classify patients with epilepsy. 
Brain Inform 2017 Sep 22;4(3):159-169 [FREE Full text] [doi: 10.1007/s40708-017-0065-7] [Medline: 28434153] 22. Zheng JQ, Wang H, Gao YS, Ai ZS. Establishment and initial validation of the prediction model for postoperative complications of femoral neck fracture. Journal of TONGJI University (Medical Science) 2020;41(06):739-746. [doi: 10.16118/j.1008-0392.2020.06.010] https://bioinform.jmir.org/2022/1/e38226 JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226 | p. 10 (page number not for citation purposes) XSL FO RenderX JMIR BIOINFORMATICS AND BIOTECHNOLOGY Xu et al 23. Gao WP, Lv HJ, Zhou L, Guo SW. Decision tree algorithm applied to MIMIC-lll database for the prediction of acute kidney injury in ICU patients. Beijing Biomedical Engineering 2021;40(06):609-617. [doi: 10.3969/j.issn.1002-3208.2021.06.010] 24. Wang H, Wu W, Han C, Zheng J, Cai X, Chang S, et al. Prediction Model of Osteonecrosis of the Femoral Head After Femoral Neck Fracture: Machine Learning-Based Development and Validation Study. JMIR Med Inform 2021 Nov 19;9(11):e30079 [FREE Full text] [doi: 10.2196/30079] [Medline: 34806984] 25. Seitz S, Koehne T, Ries C, De Novo Oliveira A, Barvencik F, Busse B, et al. Impaired bone mineralization accompanied by low vitamin D and secondary hyperparathyroidism in patients with femoral neck fracture. Osteoporos Int 2013 Feb 12;24(2):641-649. [doi: 10.1007/s00198-012-2011-0] [Medline: 22581296] 26. Paccou J, Merlusca L, Henry-Desailly I, Parcelier A, Gruson B, Royer B, et al. Alterations in bone mineral density and bone turnover markers in newly diagnosed adults with lymphoma receiving chemotherapy: a 1-year prospective pilot study. Ann Oncol 2014 Feb;25(2):481-486 [FREE Full text] [doi: 10.1093/annonc/mdt560] [Medline: 24401926] 27. Marques EA, Elbejjani M, Gudnason V, Sigurdsson G, Lang T, Sigurdsson S, et al. Proximal Femur Volumetric Bone Mineral Density and Mortality: 13 Years of Follow-Up of the AGES-Reykjavik Study. 
J Bone Miner Res 2017 Jun 20;32(6):1237-1242 [FREE Full text] [doi: 10.1002/jbmr.3104] [Medline: 28276125] 28. Bokshan SL, Marcaccio SE, Blood TD, Hayda RA. Factors influencing survival following hip fracture among octogenarians and nonagenarians in the United States. Injury 2018 Mar;49(3):685-690. [doi: 10.1016/j.injury.2018.02.004] [Medline: 29426609] 29. Fakler JK, Grafe A, Dinger J, Josten C, Aust G. Perioperative risk factors in patients with a femoral neck fracture - influence of 25-hydroxyvitamin D and C-reactive protein on postoperative medical complications and 1-year mortality. BMC Musculoskelet Disord 2016 Feb 01;17(1):51 [FREE Full text] [doi: 10.1186/s12891-016-0906-1] [Medline: 26833068] 30. Sebestyén A, Boncz I, Sándor J, Nyárády J. Effect of surgical delay on early mortality in patients with femoral neck fracture. Int Orthop 2008 Jun 24;32(3):375-379 [FREE Full text] [doi: 10.1007/s00264-007-0331-z] [Medline: 17323093] 31. Le Gall J. A New Simplified Acute Physiology Score (SAPS II) Based on a European/North American Multicenter Study. JAMA 1993 Dec 22;270(24):2957. [doi: 10.1001/jama.1993.03510240069035] 32. Ma LS, Su YY, Li X. Application of simplified acute physiological score II to predict the probability of death in patients with critical neurological diseases. Chinese Journal of Neurology 2010;11:774-777. [doi: 10.3760/cma.j.issn.1006-7876.2010.11.009] 33. Kuang G, Chen Y, Wei XS. The role of 24h LCR, SOFA score and SAPS II score in the prognosis evaluation of sepsis-induced by abdominal infection. J Hunan Normal Univ (Med Sci) 2020;17(01):26-29. 34. Liu H, Xiao J, Hu X, Wang I, Zhou F. The role of simplified acute physiological score-3 in selecting cortisol hormone therapy in patients with moderate to severe acute respiratory distress syndrome. Journal of Capital Medical University 2021;42(06):915-922. 
[doi: 10.3969/j.issn.1006-7795.2021.06.003] Abbreviations AUROC: area under the receiving operating characteristic curve BMD: bone mineral density BPNN: backpropagation neural network CK: creatine kinase CKMB: creatine kinase isoenzyme FNF: femoral neck fracture ICU: intensive care unit MIMIC: Medical Information Mart for Intensive Care RF: random forest SAPS II: Simplified Acute Physiology Score II SMOTE: synthetic minority oversampling technique XGBoost: extreme gradient boosting Edited by A Mavragani; submitted 24.03.22; peer-reviewed by O Fajarda Oliveira, DZ Pan; comments to author 29.06.22; revised version received 13.07.22; accepted 09.08.22; published 19.08.22 Please cite as: Xu L, Liu J, Han C, Ai Z The Application of Machine Learning in Predicting Mortality Risk in Patients With Severe Femoral Neck Fractures: Prediction Model Development Study JMIR Bioinform Biotech 2022;3(1):e38226 URL: https://bioinform.jmir.org/2022/1/e38226 doi: 10.2196/38226 PMID: https://bioinform.jmir.org/2022/1/e38226 JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226 | p. 11 (page number not for citation purposes) XSL FO RenderX JMIR BIOINFORMATICS AND BIOTECHNOLOGY Xu et al ©Lingxiao Xu, Jun Liu, Chunxia Han, Zisheng Ai. Originally published in JMIR Bioinformatics and Biotechnology (https://bioinform.jmir.org), 19.08.2022. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Bioinformatics and Biotechnology, is properly cited. The complete bibliographic information, a link to the original publication on https://bioinform.jmir.org/, as well as this copyright and license information must be included. https://bioinform.jmir.org/2022/1/e38226 JMIR Bioinform Biotech 2022 | vol. 3 | iss. 1 | e38226 | p. 
12 (page number not for citation purposes) XSL FO RenderX

Journal

JMIR Bioinformatics and Biotechnology, JMIR Publications

Published: Aug 19, 2022

Keywords: machine learning; femoral neck fracture; hospital mortality; hip; fracture; mortality; prediction; intensive care unit; ICU; decision-making; risk; assessment; prognosis
