Background: Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive of the COVID-19 epidemiological burden. However, many of the earlier GT-based studies contain potential statistical fallacies: they measure the correlation between non-stationary time series without adjusting for multiple comparisons or for the confounding effect of media coverage, which raises concerns about an increased risk of false-positive results. In this study, we aimed to apply statistically more rigorous methods to validate the results of earlier GT-based COVID-19 studies.

Methods: We extracted the relative GT search volume for keywords associated with COVID-19 symptoms and evaluated their Granger causality with respect to weekly COVID-19 positivity in eight English-speaking countries and Japan. In addition, the impact of media coverage on keywords with significant Granger causality was further evaluated using Japanese regional data.

Results: Compared with approaches based on Pearson or Spearman's rank correlation, our Granger causality-based approach substantially reduced (by up to approximately one-third) the number of keywords identified as having a significant temporal relationship with the COVID-19 trend. "Sense of smell" and "loss of smell" were the most reliable GT keywords across all the evaluated countries; however, after adjusting for media coverage, these keyword trends did not Granger-cause the COVID-19 positivity trends (in Japan).

Conclusions: Our results suggest that some of the search keywords reported as candidate predictive measures in earlier GT-based COVID-19 studies may be unreliable; therefore, caution is necessary when interpreting published GT-based study results.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12874-021-01338-2.
Author keywords: COVID-19, Google Trends, infodemiology, Vector autoregression model, Granger causality

Abstract keywords: coronavirus disease, media, Symptoms, Coverage, Japan, epidemiological, Japanese, correlation, trend, Google, False-positive results, Predictive, Volume, supplementary material, increased risk, sequence, measure, Pearson, approach, country, statistical, Result, include, reported, addition, evaluated, eight, adjusted, statistically, multiple comparison, with COVID-19

Title keywords: Care, Google, potential risk
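The Granger-causality check described in the Methods can be illustrated with a small sketch. This is not the authors' pipeline (the keywords mention a vector autoregression model, whose details are not given here); it is a minimal bivariate F-test written with NumPy only. The names `lagged_design`, `granger_f_stat`, `searches`, and `cases` are hypothetical, the data are synthetic, and the lag order of 2 is an arbitrary assumption. The sketch also assumes both series are already stationary; real search-volume and positivity series would typically need differencing first.

```python
import numpy as np


def lagged_design(series, lags):
    """Return columns series[t-1], ..., series[t-lags] for t = lags..n-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def granger_f_stat(y, x, lags=2):
    """F statistic for H0: lags of x add nothing to an AR(lags) model of y."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lags:]
    const = np.ones((len(target), 1))
    # Restricted model: y explained by a constant and its own lags only.
    restricted = np.hstack([const, lagged_design(y, lags)])
    # Unrestricted model: additionally includes the lags of x.
    unrestricted = np.hstack([restricted, lagged_design(x, lags)])

    def ssr(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lags) / (ssr_u / dof)


# Synthetic, stationary example: "cases" follow "searches" with a one-step delay.
rng = np.random.default_rng(0)
n = 200
searches = rng.normal(size=n)
cases = np.zeros(n)
cases[1:] = 0.8 * searches[:-1] + 0.1 * rng.normal(size=n - 1)

f_forward = granger_f_stat(cases, searches)  # do searches help predict cases?
f_reverse = granger_f_stat(searches, cases)  # do cases help predict searches?
print(f"searches -> cases F = {f_forward:.1f}")
print(f"cases -> searches F = {f_reverse:.1f}")
```

A large F statistic in one direction and a small one in the other is the pattern a Granger test looks for. Turning the statistic into a p-value requires the F distribution with `lags` and `dof` degrees of freedom, and testing many keywords at once would additionally call for a multiple-comparison correction, as the abstract notes.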