An intelligent early warning system of analyzing Twitter data using machine learning on COVID-19 surveillance in the US

Abstract
On 11 March 2020, the World Health Organization (WHO) declared the spread of coronavirus disease 2019 (COVID-19) a pandemic. Traditional infectious disease surveillance failed to alert public health authorities in time for them to intervene and to mitigate and control COVID-19 before it became a pandemic. Rich data from social media, including Twitter, has been considered a useful surveillance tool that can overcome the limitations of the traditional surveillance system. This paper proposes an intelligent COVID-19 early warning system that applies novel machine learning methods to Twitter data. We classify tweets by fine-tuning BERT, a pre-trained natural language processing (NLP) model. We also implement a COVID-19 forecasting model, a Twitter-based linear regression model, to detect early signs of a COVID-19 outbreak. Furthermore, we develop an expert system: an early warning web application based on the proposed methods. The experimental results suggest that it is feasible to use Twitter data for COVID-19 surveillance and prediction in the US to support health departments’ decision-making.
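To illustrate the idea of a Twitter-based linear regression forecast, the following is a minimal sketch, not the model used in this paper. It assumes that daily counts of COVID-19-related tweets (for example, tweets flagged by a classifier) lead daily case counts by a fixed number of days. The function names, the `lag` parameter, and the toy numbers are hypothetical and chosen only for illustration.

```python
def fit_lagged_linear_model(tweet_counts, case_counts, lag):
    """Fit cases[t] ~ slope * tweets[t - lag] + intercept by ordinary least squares.

    Hypothetical helper: the single-feature form and the fixed lag are
    assumptions made for this sketch, not details taken from the paper.
    """
    if len(tweet_counts) != len(case_counts):
        raise ValueError("tweet_counts and case_counts must have equal length")
    if lag < 0 or len(tweet_counts) - lag < 2:
        raise ValueError("lag leaves fewer than two aligned observations")

    # Align each day's case count with the tweet count from `lag` days earlier.
    x = [float(v) for v in tweet_counts[:len(tweet_counts) - lag]]
    y = [float(v) for v in case_counts[lag:]]

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("tweet counts are constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_cases(recent_tweet_counts, slope, intercept):
    """Project case counts `lag` days ahead from the most recent tweet counts."""
    return [slope * float(v) + intercept for v in recent_tweet_counts]


# Toy, synthetic series (not real data): cases follow tweets with a 3-day delay.
tweets = [10, 20, 30, 40, 50, 60, 70, 80]
cases = [0, 0, 0, 25, 45, 65, 85, 105]

slope, intercept = fit_lagged_linear_model(tweets, cases, lag=3)
next_three_days = forecast_cases(tweets[-3:], slope, intercept)
```

In this sketch the last three tweet counts have no matching case counts yet, so they serve as inputs for the next three days of projected cases. An early warning rule could then compare those projections against a threshold, which would have to be chosen separately.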

All Keywords
Author keywords: epidemic intelligence, COVID-19 surveillance, Text classification, Early warning system, BERT
Abstract keywords: COVID-19, coronavirus disease, public health, pandemic, media, Infectious disease, Spread, Health, COVID-19 outbreak, Surveillance, WHO, Support, World Health Organization, limitation, Linear regression model, mitigate, detect, develop, overcome, feasible, intervene, public health authority
Title keywords: COVID-19, Surveillance