Pub. online: 28 Oct 2025
Type: Data Science Conversation
Open Access
Journal: Journal of Data Science
Volume 23, Issue 4 (2025): Special Issue: Statistical Frontiers of Data Science, pp. 695–715
Abstract
Over the past three decades, the discipline of statistics has undergone profound transformation, driven by the rapid emergence of data science and artificial intelligence. These developments have reshaped methodological paradigms and introduced new challenges and opportunities for statistical education, particularly in China. In this context, Professor Xizhi Wu from the School of Statistics at Renmin University of China has remained closely engaged with the evolving landscape, demonstrating keen insight and a forward-looking perspective. Through sustained contributions to teaching, research, and educational reform, Professor Wu has deeply influenced generations of students and educators, playing a pivotal role in the advancement of statistical education. To document and reflect on this legacy, the Capital of Statistics conducted an in-depth interview with Professor Wu, focusing on his academic trajectory, professional contributions, and perspectives on the future of the discipline. The conversation also recounts meaningful interactions with his students, offering a multidimensional portrait of a life devoted to statistics.
A challenge that data scientists face is building an analytic product that is useful and trustworthy for a given audience. Previously, a set of principles for describing data analyses was defined; these principles can be used to create a data analysis and to characterize the variation between analyses. Here, we introduce a concept called the alignment of a data analysis between the data analyst and an audience. We define an aligned data analysis as the matching of principles between the analyst and the audience for whom the analysis is developed. In this paper, we propose a model for evaluating the alignment of a data analysis and describe some of its properties. We argue that, more generally, this framework provides a language for characterizing alignment and can be used as a guide for practicing data scientists to build better data products.
Dr. David S. Salsburg’s career has been an exceptional one. He was the first statistician to work at Pfizer, Inc., and later became the first statistician from the pharmaceutical industry to be elected as an ASA fellow. He played a vital role as a statistician at Pfizer, Inc. at a time when the drug approval process was being developed. For his contributions, Dr. Salsburg was awarded the Career Achievement Award of the Biostatistics Section of the Pharmaceutical Research and Manufacturers of America in 1994, for “significant contributions to the advancement of biostatistics in the pharmaceutical industry”. Dr. Salsburg also managed to achieve something rare among scientists, which is to popularize his field of research and make it accessible and enjoyable to laypeople. Dr. Salsburg is possibly best known for his book “The Lady Tasting Tea – How Statistics Revolutionized the 20th Century Science”, in which he combines simple and engaging explanations of statistical methods, and why they are needed, with personal stories told with a great deal of generosity, fondness, and humor about the people who developed them. Dr. Salsburg’s admiration for those statisticians shines through. In this interview, Dr. Salsburg shares his own stories and perspectives, from his childhood, through his service in the Navy and his long and productive career at Pfizer, Inc., to his equally productive retirement, in which he authored “The Lady Tasting Tea” and other books.
Ultrasonic testing has been considered a promising method for diagnosing and characterizing masonry walls. As ultrasonic waves tend to travel faster in denser materials, their use is common in evaluating the conditions of various materials. The presence of internal voids, for example, would alter the wave path, and this distinct behavior could be employed to identify unknown conditions within the material, allowing for the assessment of its condition. Therefore, we applied mixed models and Gaussian processes to analyze the behavior of ultrasonic waves on masonry walls and identify relevant factors impacting their propagation. Under both models, we observed that the average propagation time differs depending on the material. Additionally, the condition of the wall influences the propagation time. Gaussian process and mixed model performances are compared, and we conclude that these models can be useful in a classification model to automatically identify anomalies within masonry walls.
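As a rough illustration of the Gaussian process side of the comparison above, the sketch below fits a GP posterior mean to propagation times as a function of path length. It is not the authors' model: the RBF kernel, the hyperparameters, the function names, and the four data points (path length in cm, propagation time in microseconds) are all assumptions invented for the example.

```python
import math

def rbf_kernel(a, b, length_scale=10.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior_mean(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean of a GP with constant prior mean equal to mean(y_train)."""
    mean_y = sum(y_train) / len(y_train)
    K = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    alpha = solve(K, [y - mean_y for y in y_train])
    return [mean_y + sum(rbf_kernel(x, a) * w for a, w in zip(x_train, alpha))
            for x in x_new]

# Hypothetical measurements, NOT data from the study.
path_cm = [10.0, 20.0, 30.0, 40.0]
time_us = [25.0, 48.0, 77.0, 101.0]

# Predict the propagation time at an unmeasured path length.
print(gp_posterior_mean(path_cm, time_us, [25.0]))
```

In a setting like the one described, a measured time that falls far from the predicted mean for a given material could be flagged as a possible anomaly; a real analysis would also need the posterior variance and fitted hyperparameters, which this sketch omits.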
Pub. online: 22 Feb 2021
Type: Data Science In Action
Open Access
Journal: Journal of Data Science
Volume 19, Issue 2 (2021): Special issue: Continued Data Science Contributions to COVID-19 Pandemic, pp. 334–347
Abstract
Coronavirus and the COVID-19 pandemic have substantially altered the ways in which people learn, interact, and discover information. In the absence of everyday in-person interaction, how do people self-educate while living in isolation during such times? More specifically, do communities emerge in Google search trends related to coronavirus? Using a suite of network and community detection algorithms, we scrape and mine all Google search trends in America related to an initial search for “coronavirus,” from the first Google search on the term (January 16, 2020) through August 11, 2020. Results indicate a near-constant shift in the structure of how people educate themselves on coronavirus. Queries in the earliest days focus on “Wuhan” and “China”, then shift to “stimulus checks” at the height of the virus in the U.S., and finally shift to queries related to local surges of new cases in later days. A few communities emerge surrounding terms more overtly related to coronavirus (e.g., “cases”, “symptoms”, etc.). Yet, given the shift in related Google queries and the broader information environment, clear community structure for the full search space does not emerge.
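One common way to judge whether a network has "clear community structure" is Newman's modularity score, which compares the share of edges inside each community with the share expected if edges were placed at random. The abstract does not say which algorithms or scores were used, so the sketch below is only an illustration: the `modularity` function, the toy graph of related queries, and the terms "outbreak" and "testing" are assumptions, not data from the study.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of nodes in the community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical "related query" graph: two triangles joined by one bridge edge.
edges = [
    ("wuhan", "china"), ("china", "outbreak"), ("wuhan", "outbreak"),
    ("cases", "symptoms"), ("symptoms", "testing"), ("cases", "testing"),
    ("outbreak", "cases"),
]

two_groups = {"wuhan": "A", "china": "A", "outbreak": "A",
              "cases": "B", "symptoms": "B", "testing": "B"}
one_group = {node: "A" for node in two_groups}

print(round(modularity(edges, two_groups), 3))  # 0.357
print(modularity(edges, one_group))             # 0.0
```

A score near zero, as for the single-group partition, means the grouping explains no more structure than chance; a finding that no clear community structure emerges would correspond to low scores like this even for the best partition found.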