Machine learning models for the prediction of the SEIRD variables for the COVID-19 pandemic based on a deep dependence analysis of variables

Yullis Quintero, Douglas Ardila, Edgar Camargo, Francklin Rivas, Jose Aguilar*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)

Abstract

The SEIRD (Susceptible, Exposed, Infected, Recovered, and Dead) model is a mathematical model based on dynamic equations that is widely used to characterize the COVID-19 pandemic. This paper discusses a different approach: the development of predictive models for the SEIRD variables based on the historical data collected and on the context variables of the setting where the model is applied. In particular, the context variables examined in this paper include total population, number of people over 65 years old, poverty index, morbidity rates, average age, and population density. To construct the SEIRD predictive models, this study carries out a deep analysis of the dependence among these variables and of their relationship with the context variables. Hence, before the predictive models are developed using machine learning techniques, a methodology to analyze the interdependence of the SEIRD variables is proposed. The dependence on the context variables is also analyzed in order to avoid the curse of dimensionality and multicollinearity problems, which leads to better results and a lower computational cost. Finally, several prediction models based on various machine learning techniques and inputs are considered; the inputs include temporal interdependence, temporal intra-dependence, and dependence on context variables. Each predictive model is studied, along with its prediction quality. This paper focuses on analyzing the quality of this approach as applied in Colombia, and reports the performance of the predictive models for the SEIRD variables. The results are very encouraging, since the values obtained for the quality metrics are quite good for different prediction horizons.
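As a rough illustration of two ideas named in the abstract, the sketch below screens context variables for multicollinearity and builds lagged inputs that capture the temporal intra-dependence of one SEIRD variable. It is a minimal sketch under stated assumptions, not the authors' methodology: the function names (`pearson`, `drop_collinear`, `lagged_rows`), the 0.9 correlation threshold, the greedy selection rule, and all numeric values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_collinear(columns, threshold=0.9):
    """Greedily keep a column only if |r| with every kept column is below threshold.

    The threshold and the greedy rule are assumptions made for illustration.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def lagged_rows(series, n_lags, horizon=1):
    """Build (inputs, target) pairs: n_lags past values -> value `horizon` steps ahead."""
    rows = []
    for t in range(n_lags, len(series) - horizon + 1):
        rows.append((series[t - n_lags:t], series[t + horizon - 1]))
    return rows

# Made-up illustrative values, not data from the paper.
context = {
    "total_population": [1.0, 2.0, 3.0, 4.0, 5.0],
    "population_over_65": [1.1, 2.0, 3.2, 3.9, 5.1],  # nearly proportional to the first column
    "poverty_index": [0.5, 0.1, 0.4, 0.2, 0.3],
}
selected = drop_collinear(context)

infected = [10, 12, 15, 19, 24, 30]  # hypothetical daily counts
pairs = lagged_rows(infected, n_lags=3)
```

Here `selected` keeps only the context variables that are not strongly correlated with one already kept, and each element of `pairs` could serve as one training example for a regression model. Changing `horizon` gives training pairs for longer prediction horizons.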

Original language: English
Article number: 104500
Journal: Computers in Biology and Medicine
Volume: 134
DOI
Status: Published - Jul 2021
Externally published

Bibliographical note

Publisher Copyright:
© 2021 Elsevier Ltd

Funding

Funders (funder number)
Min Ciencias Colombia
Universidad EAFIT (1216101576695)
