Black-box machine learning models are recognized as useful tools for prediction applications, but the algorithmic complexity of some models makes interpretation challenging. Explainability methods have been proposed to provide insight into these models, but there is little research focused on supervised modeling with functional data inputs. We argue that, especially in applications of high consequence, it is important to explicitly model the functional dependence in a black-box analysis so as not to obscure or misrepresent patterns in explanations. As such, we propose the Variable importance Explainable Elastic Shape Analysis (VEESA) pipeline for training supervised machine learning models with functional inputs. The pipeline is an analysis process comprising data preprocessing, modeling, and post-hoc explanations. The preprocessing is done using elastic functional principal components analysis, which accounts for vertical and horizontal variability in functional data and, ultimately, allows for explanations in the original data space that identify the important functional variability without bias due to correlated variables. We demonstrate the pipeline on two high-consequence applications: explosives classification for national security and inkjet printer identification in forensic science. The applications exhibit the VEESA pipeline's ability to provide an understanding of the characteristics of the functional data useful for prediction. Code for implementing the pipeline is available in the veesa R package (and supplemental Python code).
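The elastic preprocessing in the pipeline builds on elastic shape analysis, whose core device is the square-root slope function (SRSF) representation q(t) = sign(f'(t))·√|f'(t)|, under which time warpings act by isometries and horizontal variability can be separated from vertical variability. As a hedged illustration (a minimal NumPy sketch, not the veesa package's own implementation), the transform might look like:

```python
import numpy as np

def srsf(f, t):
    """Square-root slope function (SRSF) of a sampled curve f(t).

    The SRSF q(t) = sign(f'(t)) * sqrt(|f'(t)|) underlies elastic
    shape analysis: alignment by warping acts as an isometry on q,
    which lets vertical and horizontal variability be separated
    before (functional) principal components analysis.
    """
    fdot = np.gradient(f, t)  # numerical derivative f'(t)
    return np.sign(fdot) * np.sqrt(np.abs(fdot))

# For the linear curve f(t) = 2t, f'(t) = 2, so q(t) = sqrt(2) everywhere.
t = np.linspace(0.0, 1.0, 101)
q = srsf(2.0 * t, t)
```

In a full elastic FPCA, curves would first be aligned in SRSF space and principal components would then be computed on the aligned (vertical) and warping (horizontal) parts.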
Society’s capacity for algorithmic problem-solving has never been greater. Artificial intelligence is now applied across more domains than ever, a consequence of powerful abstractions, abundant data, and accessible software. As capabilities have expanded, so have risks, with models often deployed without a full understanding of their potential impacts. Interpretable and interactive machine learning aims to make complex models more transparent and controllable, enhancing user agency. This review synthesizes key principles from the growing literature in this field. We first introduce precise vocabulary for discussing interpretability, such as the distinction between glass box and explainable models. We then explore connections to classical statistical and design principles, such as parsimony and the gulfs of interaction. Basic explainability techniques – including learned embeddings, integrated gradients, and concept bottlenecks – are illustrated with a simple case study. We also review criteria for objectively evaluating interpretability approaches. Throughout, we underscore the importance of considering audience goals when designing interactive data-driven systems. Finally, we outline open challenges and discuss the potential role of data science in addressing them. Code to reproduce all examples can be found at https://go.wisc.edu/3k1ewe.
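One of the explainability techniques named above, integrated gradients, attributes a model's prediction to its inputs via the path integral IG_i(x) = (x_i − x'_i)·∫₀¹ ∂F/∂x_i(x' + α(x − x')) dα from a baseline x'. As a hedged sketch (the function names here are illustrative and are not taken from the review's code), a midpoint-rule approximation can be written in a few lines of NumPy:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=64):
    """Midpoint-rule approximation of integrated gradients.

    grad_fn(z) returns the gradient of the model F at point z.
    Attributions satisfy the completeness property: they sum
    (approximately) to F(x) - F(baseline).
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1]
    grads = np.stack(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas]
    )
    return (x - baseline) * grads.mean(axis=0)

# Toy model F(x) = sum(x_i^2), gradient 2x, zero baseline:
# the exact attributions are x_i^2, summing to F(x) - F(0).
attrib = integrated_gradients(
    lambda z: 2.0 * z, x=np.array([3.0, 4.0]), baseline=np.zeros(2)
)
```

The baseline choice and number of interpolation steps are the main practical knobs; in deep learning settings `grad_fn` would come from automatic differentiation rather than a hand-written gradient.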