COVID-19 is a disease caused by the severe acute respiratory syndrome coronavirus 2 (SARSCoV-2) that was reported to spread in people in December 2019. Understanding epidemiological
features of COVID-19 is important for the ongoing global efforts to contain the virus. As a
complement to the available work, in this article we analyze the Kaggle novel coronavirus dataset
of 3397 patients dated from January 22, 2020 to March 29, 2020. We employ semiparametric
and nonparametric survival models as well as text mining and data visualization techniques to
examine the clinical manifestations and epidemiological features of COVID-19. Our analysis
shows that: (i) the median incubation time is about 5 days and older people tend to have a
longer incubation period; (ii) the median time for infected people to recover is about 20 days,
and the recovery time is significantly associated with age but not gender; (iii) the fatality rate
is higher for older infected patients than for younger patients
Law and legal studies has been an exciting new field for data science applications whereas the technological advancement also has profound implications for legal practice. For example, the legal industry has accumulated a rich body of high quality texts, images and other digitised formats, which are ready to be further processed and analysed by data scientists. On the other hand, the increasing popularity of data science has been a genuine challenge to legal practitioners, regulators and even general public and has motivated a long-lasting debate in the academia focusing on issues such as privacy protection and algorithmic discrimination. This paper collects 1236 journal articles involving both law and data science from the platform Web of Science to understand the patterns and trends of this interdisciplinary research field in terms of English journal publications. We find a clear trend of increasing publication volume over time and a strong presence of high-impact law and political science journals. We then use the Latent Dirichlet Allocation (LDA) as a topic modelling method to classify the abstracts into four topics based on the coherence measure. The four topics identified confirm that both challenges and opportunities have been investigated in this interdisciplinary field and help offer directions for future research.
Coronavirus and the COVID-19 pandemic have substantially altered the ways in which people learn, interact, and discover information. In the absence of everyday in-person interaction, how do people self-educate while living in isolation during such times? More specifically, do communities emerge in Google search trends related to coronavirus? Using a suite of network and community detection algorithms, we scrape and mine all Google search trends in America related to an initial search for “coronavirus,” starting with the first Google search on the term (January 16, 2020) to recently (August 11, 2020). Results indicate a near-constant shift in the structure of how people educate themselves on coronavirus. Queries in the earliest days focusing on “Wuhan” and “China”, then shift to “stimulus checks” at the height of the virus in the U.S., and finally shift to queries related to local surges of new cases in later days. A few communities emerge surrounding terms more overtly related to coronavirus (e.g., “cases”, “symptoms”, etc.). Yet, given the shift in related Google queries and the broader information environment, clear community structure for the full search space does not emerge.