Background The COVID-19 pandemic began in early 2021 and placed significant strains on health care systems worldwide. There remains a compelling need to analyze factors that are predictive for patients at elevated risk of morbidity and mortality. Objective The goal of this retrospective study of patients who tested positive with COVID-19 and were treated at NYU (New York University) Langone Health was to identify clinical markers predictive of disease severity in order to assist in clinical decision triage and to provide additional biological insights into disease progression. Methods The clinical activity of 3740 patients at NYU Langone Hospital was obtained between January and August 2020; patient data were deidentified. Models were trained on clinical data during different parts of their hospital stay to predict three clinical outcomes: deceased, ventilated, or admitted to the intensive care unit (ICU). Results The XGBoost (eXtreme Gradient Boosting) model that was trained on clinical data from the final 24 hours excelled at predicting mortality (area under the curve [AUC]=0.92; specificity=86%; and sensitivity=85%). Respiration rate was the most important feature, followed by SpO 2 (peripheral oxygen saturation) and being aged 75 years and over. Performance of this model to predict the deceased outcome extended 5 days prior, with AUC=0.81, specificity=70%, and sensitivity=75%. When only using clinical data from the first 24 hours, AUCs of 0.79, 0.80, and 0.77 were obtained for deceased, ventilated, or ICU-admitted outcomes, respectively. Although respiration rate and SpO 2 levels offered the highest feature importance, other canonical markers, including diabetic history, age, and temperature, offered minimal gain. When lab values were incorporated, prediction of mortality benefited the most from blood urea nitrogen and lactate dehydrogenase (LDH). Features that were predictive of morbidity included LDH, calcium, glucose, and C-reactive protein. Conclusions Together, this work summarizes efforts to systematically examine the importance of a wide range of features across different endpoint outcomes and at different hospitalization time points.
【저자키워드】 COVID-19, SARS-CoV-2, coronavirus, Mortality, Decision making, severity, machine learning, hospital, prediction, Symptom, outcome, New York City, morbidity, Model, Predictive modeling, marker, 【초록키워드】 intensive care, Hospitalization, COVID-19 pandemic, disease severity, LDH, C-reactive protein, risk, lactate dehydrogenase, ICU, outcomes, Disease progression, Retrospective study, Patient, age, temperature, morbidity and mortality, performance, university, New York, Health care system, predict, Glucose, Deceased, strain, AUC, Hospital stay, Predictive, 24 hours, patient data, Blood urea nitrogen, Diabetic, Factor, Ventilated, Endpoint, Clinical data, peripheral oxygen saturation, effort, clinical decision, respiration rate, Final, positive, clinical activity, canonical markers, objective, feature, Gradient, Result, highest, tested, identify, elevated, treated, assist, was obtained, 24 hour, clinical marker, lab value, offered, with COVID-19, 【제목키워드】 clinical, modeling, development,