Applying multiple linear regression to analyze the relationship between relative search volume and COVID-19 cases in Brazil
DOI:
https://doi.org/10.5585/exactaep.2022.20401
Keywords:
COVID-19, Multiple linear regression, Data science, Google Trends
Abstract
Researchers across many scientific fields, infodemiology in particular, have worked to propose solutions to the socioeconomic problems caused by the SARS-CoV-2 pandemic and the accompanying social isolation measures. Early insight into how COVID-19 may be spreading in a particular region is an effective tool for tackling this challenge. This paper analyses, in an explanatory way, the relationship between publicly available relative search volume data from Google Trends® and government data on COVID-19 infections in the Brazilian states of Minas Gerais, Rio de Janeiro, and São Paulo. The analyses indicate that the search terms “Teste de COVID” (“COVID test”) and “Sintomas de COVID” (“COVID symptoms”) could be used as explanatory variables for the number of cases in these states. The study offers insights to support future research on the use of relative search volume as an additional explanatory variable in predictive models, subject to its limitations.
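As a minimal illustration of the kind of model the abstract describes, the sketch below fits a multiple linear regression of case counts on two relative search volume series using ordinary least squares. It is not the authors' implementation: the function names (`solve_linear_system`, `fit_mlr`), the weekly granularity, and all data values are assumptions made for this example, and the series are synthetic placeholders rather than Google Trends or government data.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """OLS with an intercept via the normal equations.

    rows: one list of predictor values per observation.
    Returns (coefficients, r_squared), intercept first.
    """
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic placeholder data (assumption): 52 weekly observations of two
# relative search volume series on the 0-100 Google Trends scale, and a
# case count generated from them plus noise. Not real data.
rng = random.Random(0)
rsv_teste = [rng.uniform(0, 100) for _ in range(52)]
rsv_sintomas = [rng.uniform(0, 100) for _ in range(52)]
cases = [
    500 + 30 * t + 12 * s + rng.gauss(0, 50)
    for t, s in zip(rsv_teste, rsv_sintomas)
]

beta, r2 = fit_mlr([[t, s] for t, s in zip(rsv_teste, rsv_sintomas)], cases)
print("intercept, b_teste, b_sintomas:", [round(b, 2) for b in beta])
print("R^2:", round(r2, 3))
```

The solver is written out by hand only to keep the sketch dependency-free. A real analysis would more likely use a statistics library, and would also check the usual regression assumptions (residual normality, multicollinearity) before interpreting the coefficients.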
License
Copyright (c) 2022 Authors
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.