Journal:Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 178–196
Abstract
The United States has the highest number of confirmed COVID-19 cases in the world. The early hot spot states were New York, New Jersey, and Connecticut. The workforce in these states was required to work from home, except for essential services. It was necessary to determine an appropriate date for the resumption of business, since reopening the economy prematurely would lead to a broader spread of COVID-19, while reopening too late would cause greater economic loss. To reflect the real-time risk of the spread of COVID-19, it was crucial to estimate the number of infected individuals who had not yet been confirmed, or never would be, because COVID-19 can be transmitted pre-symptomatically and asymptomatically. To this end, we proposed an epidemic model and applied it to evaluate the real-time epidemic risk for the states of New York, New Jersey, and Connecticut. We used California as the benchmark state because California began a phased reopening on May 8, 2020. The dates on which the estimated numbers of unidentified infectious individuals per 100,000 in New York, New Jersey, and Connecticut were close to that in California on May 8, 2020, were June 1, June 22, and June 22, 2020, respectively. Following the practice in California, New York, New Jersey, and Connecticut might consider reopening their businesses on those dates. Meanwhile, according to our simulation models, to prevent a resurgence of infections after reopening the economy, it would be crucial to maintain sufficient social distancing measures after the resumption of business. This precaution turned out to be critical, as the situation in California quickly deteriorated after our analysis was completed, and its interventions after reopening were not as effective as those in New York, New Jersey, and Connecticut.
Pub. online:27 Apr 2021
Type:Philosophies Of Data Science
Journal:Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 219–242
Abstract
The coronavirus disease 2019 (COVID-19) pandemic caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has placed epidemic modeling at the center of attention of public policymaking. Predicting the severity and speed of transmission of COVID-19 is crucial to resource management and developing strategies to deal with this epidemic. Based on the available data from current and previous outbreaks, many efforts have been made to develop epidemiological models, including statistical models, computer simulations, mathematical representations of the virus and its impacts, and many more. Despite their usefulness, modeling and forecasting the spread of COVID-19 remains a challenge. In this article, we give an overview of the unique features and issues of COVID-19 data and how they impact epidemic modeling and projection. In addition, we illustrate how various models could be connected to each other. Moreover, we provide new data science perspectives on the challenges of COVID-19 forecasting, from data collection, curation, and validation to the limitations of models, as well as the uncertainty of the forecast. Finally, we discuss some data science practices that are crucial to more robust and accurate epidemic forecasting.