TY - JOUR
T1 - Machine learning models for the prediction of the SEIRD variables for the COVID-19 pandemic based on a deep dependence analysis of variables
AU - Quintero, Yullis
AU - Ardila, Douglas
AU - Camargo, Edgar
AU - Rivas, Francklin
AU - Aguilar, Jose
N1 - Publisher Copyright: © 2021 Elsevier Ltd
PY - 2021/7
Y1 - 2021/7
AB - The SEIRD (Susceptible, Exposed, Infected, Recovered, and Dead) model is a mathematical model based on dynamic equations that is widely used to characterize the COVID-19 pandemic. This paper takes a different approach: it develops predictive models for the SEIRD variables based on the collected historical data and on the context variables of the place where the model is applied. In particular, the context variables examined in this paper include total population, number of people over 65 years old, poverty index, morbidity rates, average age, and population density. To construct the SEIRD predictive models, this study carries out a deep analysis of the dependence among the SEIRD variables and of their relationship with the context variables. Hence, before the predictive models are developed using machine learning techniques, a methodology to analyze the interdependence of the SEIRD variables is proposed. The dependence on the context variables is also analyzed in order to avoid the curse of dimensionality and multicollinearity problems, which leads to better results and a lower computational cost. Finally, several prediction models based on various machine learning techniques and inputs are considered, which cover temporal interdependence, temporal intra-dependence, and dependence on context variables. Each predictive model is studied, along with its prediction quality. This paper focuses on analyzing the quality of this approach as applied in Colombia and reports the performance of the predictive models for the SEIRD variables. The results are very encouraging: the values of the quality metrics are quite good for different prediction horizons.
KW - COVID-19
KW - Data dependence analysis
KW - Machine learning
KW - Prediction model
UR - http://www.scopus.com/inward/record.url?scp=85106639893&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2021.104500
DO - 10.1016/j.compbiomed.2021.104500
M3 - Article
C2 - 34052570
AN - SCOPUS:85106639893
SN - 0010-4825
VL - 134
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 104500
ER -