Pub. online: 2 May 2024
Type: Data Science In Action
Open Access
Journal: Journal of Data Science
Volume 22, Issue 2 (2024): Special Issue: 2023 Symposium on Data Science and Statistics (SDSS): “Inquire, Investigate, Implement, Innovate”, pp. 191–207
Abstract
Attention Deficit Hyperactivity Disorder (ADHD) is a common neurodevelopmental disorder in children that is usually diagnosed subjectively. Detecting ADHD objectively from neuroimaging data has proven to be a complex problem with low accuracy, possibly due to (among other factors) complex diagnostic processes, the high number of features considered, and imperfect measurements in data collection. Hence, reliable neuroimaging biomarkers for detecting ADHD have been elusive. To address this problem, we consider a recently proposed multi-model selection method called the Sparse Wrapper AlGorithm (SWAG), a greedy algorithm that combines screening and wrapper approaches to create a set of low-dimensional models with good predictive power. While preserving previously reported levels of accuracy, SWAG provides a measure of the importance of brain regions for identifying ADHD. Our approach also provides a set of simple, equally performing models that highlight the main feature combinations to be analyzed and the interactions between them. Taking advantage of the network of models resulting from this approach, we confirm the relevance of the frontal and temporal lobes and highlight how the different regions interact to detect the presence of ADHD. In particular, these results are fairly consistent across the different learning mechanisms employed within SWAG (i.e., logistic regression and linear and radial-kernel support vector machines), thereby providing population-level insights, as well as delivering feature combinations that are smaller and often perform better than those used when the original learners are applied directly.
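To make the "screening plus wrapper" idea concrete, the following is a minimal sketch of a SWAG-like greedy search. It is not the authors' implementation. The function names (`swag_sketch`, `cv_error`, `keep_best`), the parameters (`p_max`, `alpha`, `m`), the nearest-centroid learner standing in for logistic regression or an SVM, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def cv_error(X, y, features, k=5):
    """k-fold CV misclassification rate of a nearest-centroid classifier
    restricted to `features` (a stand-in learner, assumed for illustration)."""
    n = len(y)
    wrong = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        centroids = {}
        for label in sorted(set(y[i] for i in train)):
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [mean(r[f] for r in rows) for f in features]
        for i in test:
            point = [X[i][f] for f in features]
            pred = min(centroids, key=lambda c: sq_dist(point, centroids[c]))
            wrong += pred != y[i]
    return wrong / n


def keep_best(scored, alpha):
    """Keep the models whose CV error is at or below the alpha-quantile."""
    errors = sorted(scored.values())
    cutoff = errors[max(0, math.ceil(alpha * len(errors)) - 1)]
    return {model: e for model, e in scored.items() if e <= cutoff}


def swag_sketch(X, y, p_max=3, alpha=0.3, m=20, seed=0):
    """Return {dimension: {feature tuple: CV error}} for the retained models."""
    rng = random.Random(seed)
    p = len(X[0])
    # Screening step: evaluate every one-feature model and keep the best.
    kept = {1: keep_best({(j,): cv_error(X, y, (j,)) for j in range(p)}, alpha)}
    screened = sorted(f for (f,) in kept[1])
    # Wrapper steps: grow each retained model by one screened feature.
    for d in range(2, p_max + 1):
        candidates = sorted({tuple(sorted(model + (f,)))
                             for model in kept[d - 1]
                             for f in screened if f not in model})
        if not candidates:
            break
        if len(candidates) > m:  # cap the number of models explored per step
            candidates = rng.sample(candidates, m)
        kept[d] = keep_best({c: cv_error(X, y, c) for c in candidates}, alpha)
    return kept


# Synthetic data (hypothetical): feature 0 is informative, the other 7 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[4.0 * label + data_rng.gauss(0, 0.3)]
     + [data_rng.gauss(0, 1) for _ in range(7)] for label in y]
kept = swag_sketch(X, y, p_max=3, alpha=0.5, m=60)
```

Under these assumptions, `kept` maps each model dimension to a small library of feature subsets with comparable cross-validated error. Counting how often a feature (here, a brain region) appears across the retained models is one simple way to derive the kind of importance measure and model network the abstract describes.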