Journal: Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 243–252
Abstract
The swift spread of the novel coronavirus is largely attributed to its stealthy transmission, in which infected patients may be asymptomatic or exhibit only flu-like symptoms in the early stage. Undetected transmission presents a remarkable challenge for containing the virus and poses a serious threat to the public. An urgent question concerns testing for the coronavirus. In this paper, we evaluate the situation from a statistical viewpoint by discussing the accuracy of test procedures, and we stress the importance of interpreting test results rationally.
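As an illustration of why test results need careful interpretation, the sketch below applies Bayes' theorem to compute the positive and negative predictive values of a diagnostic test. It is not taken from the paper: the function names and the sensitivity, specificity, and prevalence figures in the usage lines are assumptions chosen only to show the arithmetic.

```python
import math


def _check_probability(name, value):
    # All three inputs are probabilities, so they must lie in [0, 1].
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")


def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(infected | positive test) by Bayes' theorem."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        _check_probability(name, value)
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total = true_pos + false_pos
    # No positive results are possible at all: the value is undefined.
    return true_pos / total if total > 0 else math.nan


def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(not infected | negative test) by Bayes' theorem."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        _check_probability(name, value)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    total = true_neg + false_neg
    return true_neg / total if total > 0 else math.nan


# Hypothetical test characteristics, for illustration only.
low_prevalence_ppv = positive_predictive_value(0.90, 0.95, 0.01)
high_prevalence_ppv = positive_predictive_value(0.90, 0.95, 0.50)
print(f"PPV at 1% prevalence:  {low_prevalence_ppv:.3f}")
print(f"PPV at 50% prevalence: {high_prevalence_ppv:.3f}")
```

With these assumed figures, the same test yields a much lower positive predictive value when the infection is rare than when it is common, which is the kind of effect a statistical reading of test results has to account for.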
Journal: Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 178–196
Abstract
The United States has the highest number of confirmed COVID-19 cases in the world. The early hot spots were New York, New Jersey, and Connecticut, where the workforce was required to work from home except for essential services. An appropriate date for resuming business had to be determined, because reopening the economy prematurely would lead to a broader spread of COVID-19, while reopening too late would cause greater economic loss. To reflect the real-time risk of spread, it was crucial to estimate the number of infected individuals who had not yet been confirmed, or never would be, owing to pre-symptomatic and asymptomatic transmission. To this end, we proposed an epidemic model and applied it to evaluate the real-time epidemic risk for New York, New Jersey, and Connecticut. We used California as the benchmark state because it began a phased reopening on May 8, 2020. The dates on which the estimated numbers of unidentified infectious individuals per 100,000 in New York, New Jersey, and Connecticut were close to the California level on May 8, 2020, were June 1, June 22, and June 22, 2020, respectively. Following California's practice, New York, New Jersey, and Connecticut might then consider reopening their businesses. Meanwhile, according to our simulation models, preventing a resurgence of infections after reopening would require maintaining sufficient social distancing measures once businesses resume. This precaution turned out to be critical: the situation in California deteriorated quickly after our analysis was completed, and its interventions after reopening were not as effective as those in New York, New Jersey, and Connecticut.
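The benchmark comparison described above can be sketched as a simple lookup: convert an estimated count to a rate per 100,000 and find the first date on which a state's rate reaches the benchmark level. This is only a sketch under stated assumptions. The abstract does not give the model or the matching rule, so "at or below the benchmark" stands in for "close to", and the function names and every number in the usage lines are hypothetical placeholders, not results from the paper.

```python
from datetime import date, timedelta


def per_100k(count, population):
    """Convert an estimated count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def first_date_at_or_below(start, daily_rates, benchmark):
    """Return the first date whose rate is at or below the benchmark.

    daily_rates[i] is the estimated rate on start + i days.
    Returns None if the benchmark is never reached.
    """
    for offset, rate in enumerate(daily_rates):
        if rate <= benchmark:
            return start + timedelta(days=offset)
    return None


# Made-up daily estimates, for illustration only.
benchmark_rate = 40.0                      # hypothetical benchmark level
state_rates = [50.0, 42.0, 39.5, 30.0]     # hypothetical state estimates
match = first_date_at_or_below(date(2020, 5, 30), state_rates, benchmark_rate)
print(match)
```

A tolerance band around the benchmark would be an equally reasonable reading of "close to"; the threshold rule is used here only because it is the simplest to state.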
Previous abstractive methods apply sequence-to-sequence structures to generate summaries, without a module that helps the system detect vital mentions and relationships within a document. To address this problem, we utilize a semantic graph to boost generation performance. First, we extract important entities from each document and establish a graph inspired by the idea of distant supervision (Mintz et al., 2009). Then, we combine a Bi-LSTM with a graph encoder to obtain the representation of each graph node. A novel neural decoder is presented to leverage the information in such entity graphs. Automatic and human evaluations show the effectiveness of our technique.