Abstract: A powerful methodology for exploring relationships among items, association rules analysis can be used to capture a set of rules from any given dataset. It is little known, however, that a single dataset can be represented by more than one set of rules, i.e., by equivalent models. In fact, most studies on the goodness of a model can be misleading because they assume the model is unique. This is a phenomenon that the literature has yet to explore. In our study, we demonstrate that equivalent models exist for any dataset and propose a method for converting any given model into its dominant model, which we recommend as the benchmark model. Further, we explain how the phenomenon of equivalent models affects decision tree analysis and statistical model selection. We show that the decision rules from decision tree analysis can always be simplified by reducing them to the dominant model. Simulated and real datasets are used for illustration.
Journal: Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 243–252
Abstract
The swift spread of the novel coronavirus is largely attributed to its stealthy transmission, in which infected patients may be asymptomatic or exhibit only flu-like symptoms in the early stage. Undetected transmissions present a remarkable challenge for the containment of the virus and pose an appalling threat to the public. An urgent question concerns testing for the coronavirus. In this paper, we evaluate the situation from a statistical viewpoint by discussing the accuracy of test procedures, and we stress the importance of interpreting test results rationally.
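To make the point about interpreting test results concrete, the following minimal sketch applies Bayes' theorem to compute the positive and negative predictive values of a diagnostic test. It is an illustration only, not the paper's own analysis: the function names and the sensitivity, specificity, and prevalence figures are hypothetical values chosen for the example.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(infected | positive test), by Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(not infected | negative test), by Bayes' theorem."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Hypothetical test characteristics, assumed for illustration only.
    sensitivity, specificity = 0.95, 0.95
    for prevalence in (0.01, 0.10, 0.30):
        ppv = positive_predictive_value(sensitivity, specificity, prevalence)
        npv = negative_predictive_value(sensitivity, specificity, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed figures, a positive result at 1% prevalence corresponds to an infection probability of only about 16%, which is why the same test result can call for different interpretations in populations with different infection rates.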