Visual Analytics for NASCAR Motorsports
Pub. online: 2 July 2024
Type: Data Science In Action
Open Access
Received
4 February 2024
Accepted
7 June 2024
Published
2 July 2024
Abstract
The National Association of Stock Car Auto Racing (NASCAR) is ranked among the top ten most popular sports in the United States. NASCAR events are characterized by on-track racing punctuated by pit stops, since cars must refuel, replace tires, and modify their setup throughout a race. A well-executed pit stop can allow drivers to gain multiple seconds on their opponents. Strategies around when to pit and what to perform during a pit stop are under constant evaluation. One currently unexplored area is the publicly available communication between each driver and their pit crew during the race. Because each race produces many hours of audio, manual analysis of even one driver’s communications is prohibitive. We propose a fully automated approach to analyze driver–pit crew communication. Our work was conducted in collaboration with NASCAR domain experts. Audio communication is converted to text and summarized using cluster-based Latent Dirichlet Allocation to provide an overview of a driver’s race performance. The transcript is then analyzed to extract important events related to pit stops and driving balance: understeer (pushing) or oversteer (over-rotating). Named entity recognition (NER) and relationship extraction provide context for each event. A combination of the race summary, events, and real-time race data provided by NASCAR is presented using Sankey visualizations. Statistical analysis and evaluation by our domain expert collaborators confirmed that we can accurately identify important race events and driver interactions, and that presenting them in this novel way provides useful, important, and efficient summaries and event highlights for race preparation and in-race decision-making.
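As a rough illustration of the driving-balance event extraction mentioned in the abstract, the sketch below flags transcript utterances that mention understeer or oversteer. It is a minimal keyword matcher written under stated assumptions, not the method used in the paper: the function name `find_balance_events`, the `BALANCE_TERMS` vocabulary (including the assumed synonyms "tight" and "loose"), and the sample utterances are all hypothetical.

```python
import re

# Hypothetical vocabulary. Only "pushing" and "over-rotating" come from the
# abstract; "tight" and "loose" are assumed synonyms added for illustration.
BALANCE_TERMS = {
    "understeer": ("understeer", "pushing", "tight"),
    "oversteer": ("oversteer", "over-rotating", "loose"),
}


def find_balance_events(utterances):
    """Return (index, label, text) for each utterance that mentions a balance term."""
    events = []
    for i, text in enumerate(utterances):
        # Lowercase and split into words, keeping hyphenated words intact.
        words = set(re.findall(r"[a-z\-]+", text.lower()))
        for label, terms in BALANCE_TERMS.items():
            if words & set(terms):
                events.append((i, label, text))
    return events


# Invented sample utterances, not taken from any real transcript.
sample = [
    "Car is pushing in the center of three and four.",
    "Copy that, pit this time, four tires.",
    "Now it's over-rotating on exit.",
]
events = find_balance_events(sample)
for index, label, text in events:
    print(index, label, text)
```

A whole-word match is used so that a term is not triggered by a longer unrelated word. A production system would likely need context from NER and relationship extraction, as the abstract describes, to decide who or what each event refers to.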
Supplementary material
Python code for (1) processing the transcribed driver–pit crew text and (2) generating a web-based visualization of important events for a user-selected race and one or more drivers has been uploaded to the GitHub repository https://github.com/cghealey/JDS. Instructions for running the code are given in the README.md file.