
Journal of Data Science


Multi-Dimensional Clustering Based on Restricted Distance-Dependent Mixture Dirichlet Process for Diffusion Tensor Imaging
Volume 22, Issue 4 (2024), pp. 537–557
Soyun Park, Jihnhee Yu, Zohi Sternberg

https://doi.org/10.6339/24-JDS1125
Pub. online: 2 April 2024      Type: Statistical Data Science      Open Access

Received: 22 September 2023
Accepted: 12 March 2024
Published: 2 April 2024

Abstract

Brain imaging research poses challenges due to the intricate structure of the brain and the absence of clearly discernible features in the images. In this study, we propose a technique for analyzing brain image data that identifies crucial regions relevant to patients’ conditions, focusing specifically on Diffusion Tensor Imaging data. Our method uses a Bayesian Dirichlet process prior that incorporates generalized linear models, which enhances clustering performance while retaining the flexibility to accommodate a varying number of clusters. Our approach improves the identification of potential classes by using locational information, treating the proximity between locations as a clustering constraint. We apply our technique to a dataset from the Transforming Research and Clinical Knowledge in Traumatic Brain Injury study, aiming to identify important regions in the brain’s gray matter, white matter, and overall brain tissue that differentiate between young and old age groups. Additionally, we explore a link between our findings and existing results in the field of brain network research.
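To make the idea of proximity as a clustering constraint concrete, the following is a minimal sketch of a single draw from a Chinese-restaurant-process-style prior in which a location may join an existing cluster only if that cluster already contains a nearby location. It is an illustration under stated assumptions, not the RDMDP method itself: the function name `restricted_crp_partition`, the `radius` threshold, the concentration value `alpha`, and the toy coordinates are all hypothetical, and the sketch samples from a prior only, with no likelihood, generalized linear model, or MCMC step.

```python
import math
import random


def restricted_crp_partition(points, alpha=1.0, radius=1.5, seed=0):
    """Draw one partition from a CRP-style prior with a proximity restriction.

    Hypothetical illustration only: a point may join an existing cluster
    only if that cluster already contains a point within `radius` of it.
    """
    rng = random.Random(seed)
    labels = []    # cluster label of each point, in input order
    members = {}   # cluster label -> indices of the points assigned to it
    for i, p in enumerate(points):
        candidates, weights = [], []
        for label, idx in members.items():
            # Restriction: the cluster must contain a neighbour of point i.
            if any(math.dist(p, points[j]) <= radius for j in idx):
                candidates.append(label)
                weights.append(len(idx))   # CRP weight: current cluster size
        candidates.append(len(members))    # option: open a new cluster
        weights.append(alpha)              # CRP weight: concentration
        choice = rng.choices(candidates, weights=weights, k=1)[0]
        labels.append(choice)
        members.setdefault(choice, []).append(i)
    return labels


# Two well-separated groups of voxel-like coordinates on a 3D grid.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10)]
labels = restricted_crp_partition(points, alpha=0.5, radius=1.5, seed=1)
print(labels)
```

In this toy example the two groups of coordinates are farther apart than `radius`, so no cluster can contain locations from both groups, while the number of clusters within each group is left random, as in an unrestricted Dirichlet process prior.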

Supplementary material

The Supplementary Materials include an MCMC algorithm for the RDMDP method, a simulation study with 100 replications, an explanation of the connection between the identified brain clusters and the regions of interest, and a statement regarding the R code for the RDMDP method.



Copyright
2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
adjacency matrix; Bayesian Dirichlet process prior; brain imaging; clustering; pattern recognition


Journal of Data Science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
