The National Association of Stock Car Auto Racing (NASCAR) is ranked among the top ten most popular sports in the United States. NASCAR events are characterized by on-track racing punctuated by pit stops since cars must refuel, replace tires, and modify their setup throughout a race. A well-executed pit stop can allow drivers to gain multiple seconds on their opponents. Strategies around when to pit and what to perform during a pit stop are under constant evaluation. One currently unexplored area is publically available communication between each driver and their pit crew during the race. Due to the many hours of audio, manual analysis of even one driver’s communications is prohibitive. We propose a fully automated approach to analyze driver–pit crew communication. Our work was conducted in collaboration with NASCAR domain experts. Audio communication is converted to text and summarized using cluster-based Latent Dirichlet Analysis to provide an overview of a driver’s race performance. The transcript is then analyzed to extract important events related to pit stops and driving balance: understeer (pushing) or oversteer (over-rotating). Named entity recognition (NER) and relationship extraction provide context to each event. A combination of the race summary, events, and real-time race data provided by NASCAR are presented using Sankey visualizations. Statistical analysis and evaluation by our domain expert collaborators confirmed we can accurately identify important race events and driver interactions, presented in a novel way to provide useful, important, and efficient summaries and event highlights for race preparation and in-race decision-making.
Pub. online:4 Aug 2022Type:Research ArticleOpen Access
Journal:Journal of Data Science
Volume 18, Issue 5 (2020): Special Issue S1 in Chinese (with abstract in English), pp. 875–888
Abstract
In the wake of the COVID-19 outbreak, the public resorted to Sina Weibo as a major platform for the trend of the pandemic. Research on public sentiment and topic mining of major public sentiment events based on Sina Weibo’s comment data is important for understanding the trend of public opinions during major epidemic outbreaks. Based on classification of the Chinese language into emotion categories in psychology, we use open source tools to build naive Bayesian models to classify Weibo comments. Visualization of comment topics is achieved with word co-occurrence network methods. Commented topics are mined with the help of the latent Dirichlet distribution model. The results show that the psychological sentiment classification combined with the naive Bayesian model can reflect the evolvement of public sentiment during the epidemic, and that the latent Dirichlet distribution model and word co-occurrence network can effectively mine the topics of public concerns.
Abstract: Many nations’ defence departments use capabilitybased planning to guide their investment and divestment decisions. This planning process involves a variety of data that in its raw form is difficult for decisionmakers to use. In this paper we describe how dimensionality reduction and partition clustering are used in the Canadian Armed Forces to create visualizations that convey how important military capabilities are in planning scenarios and how much capacity the planned force structure has to provide the capabilities. Together, these visualizations give decisionmakers an overview of which capabilities may require investment or may be candidates for divestment.