FROSTY: A High-Dimensional Scale-Free Bayesian Network Learning Method

Bang, Joshua; Oh, Sang-Yun

doi:10.6339/23-JDS1097

Journal of Data Science

FROSTY: A High-Dimensional Scale-Free Bayesian Network Learning Method

Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022, pp. 354–367

Joshua Bang Sang-Yun Oh

https://doi.org/10.6339/23-JDS1097

Pub. online: 14 March 2023 Type: Statistical Data Science

Open Access

Received
31 July 2022

Accepted
7 March 2023

Published
14 March 2023

Abstract

We propose a scalable Bayesian network learning algorithm based on sparse Cholesky decomposition. Our approach only requires observational data and user-specified confidence level as inputs and can estimate networks with thousands of variables. The computational complexity of the proposed method is $O({p^{3}})$ for a graph with p vertices. Extensive numerical experiments illustrate the usefulness of our method with promising results. In simulation, the initial step in our approach also improves an alternative Bayesian network structure estimation method that uses an undirected graph as an input.

Supplementary material

Supplementary Material

Supplementary material includes proofs of the theorems and the Python code for simulation.

References

Abramson B, Brown J, Edwards W, Murphy A, Winkler RL (1996). Hailfinder: A Bayesian system for forecasting severe weather. International Journal of Forecasting, 12(1): 57–71. https://doi.org/10.1016/0169-2070(95)00664-8

Amestoy PR, Davis TA, Duff IS (1996). An approximate minimum degree ordering algorithm. SIAM Journal on Matrix Analysis and Applications, 17(4): 886–905. https://doi.org/10.1137/S0895479894278952

Barabási AL, Albert R (1999). Emergence of scaling in random networks. Science, 286(5439): 509–512. https://doi.org/10.1126/science.286.5439.509

Cisneros-Velarde P, Petersen A, Oh SY (2020). Distributionally robust formulation and model selection for the graphical lasso. In: International Conference on Artificial Intelligence and Statistics, 756–765, PMLR.

Friedman J, Hastie T, Tibshirani R (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3): 432–441. https://doi.org/10.1093/biostatistics/kxm045

Friedman N (2004). Inferring cellular networks using probabilistic graphical models. Science, 303(5659): 799–805. https://doi.org/10.1126/science.1094068

Goodfellow I, Bengio Y, Courville A, Bengio Y (2016). Deep learning, volume 1. MIT press, Cambridge.

Hastie T, Tibshirani R, Friedman JH, Friedman JH (2009). The elements of statistical learning: data mining, inference, and prediction, volume 2. Springer.

Heckerman DE, Nathwani BN (1992). An evaluation of the diagnostic accuracy of pathfinder. Computers and Biomedical Research, 25(1): 56–74. https://doi.org/10.1016/0010-4809(92)90035-9

Huang X, Acero A, Hon HW, Reddy R (2001). Spoken language processing: A guide to theory, algorithm, and system development. Prentice hall PTR.

Luo J, Savakis AE, Singhal A (2005). A Bayesian network-based framework for semantic image understanding. Pattern recognition, 38(6): 919–934. https://doi.org/10.1016/j.patcog.2004.11.001

Pearl J (2014). Probabilistic reasoning in intelligent systems: networks of plausible inference. Elsevier.

Pourahmadi M (1999). Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterisation. Biometrika, 86(3): 677–690. https://doi.org/10.1093/biomet/86.3.677

Raskutti G, Uhler C (2018). Learning directed acyclic graph models based on sparsest permutations. Stat, 7(1): e183. https://doi.org/10.1002/sta4.183

Robinson RW (1977). Counting unlabeled acyclic digraphs. In: Combinatorial Mathematics V: Proceedings of the Fifth Australian Conference, Held at the Royal Melbourne Institute of Technology, August 24–26, 1976, 28–43. Springer.

Rose DJ (1970). Symmetric elimination on sparse positive definite systems and the potential flow network problem, Ph.D. thesis, Harvard University.

Rose DJ (1972). A graph-theoretic study of the numerical solution of sparse positive definite systems of linear equations. In: Graph theory and computing, 183–217. Elsevier.

Rue H, Held L (2010). Discrete spatial variation. In: Handbook of spatial statistics, 171–200.

Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721): 523–529. https://doi.org/10.1126/science.1105809

Sioutis M, Condotta JF (2014). Tackling large qualitative spatial networks of scale-free-like structure. In: Artificial Intelligence: Methods and Applications: 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15–17, 2014. Proceedings 8, 178–191. Springer.

Spirtes P, Glymour CN, Scheines R, Heckerman D (2000). Causation, prediction, and search. MIT press.

Squires C, Amaniampong J, Uhler C (2020). Efficient permutation discovery in causal dags. arXiv preprint: https://arxiv.org/abs/2011.03610.

Tran C, Cisneros-Velarde P, Oh SY, Petersen A (2022). Family-wise error rate control in Gaussian graphical model selection via distributionally robust optimization. Stat, 11(1): e477, Wiley Online Library.

Van de Geer S, Bühlmann P, et al. (2013). ${\ell _{0}}$-penalized maximum likelihood for sparse directed acyclic graphs. Annals of Statistics, 41(2): 536–567.

Verma TS, Pearl J (2022). Equivalence and synthesis of causal models. In: Probabilistic and Causal Inference: The Works of Judea Pearl. 221–236.

Wainwright MJ, Jordan MI (2008). Graphical models, exponential families, and variational inference. Now Publishers Inc.

Yannakakis M (1981). Computing the minimum fill-in is np-complete. SIAM Journal on Algebraic Discrete Methods, 2(1): 77–79. https://doi.org/10.1137/0602010

IEEE Transactions on Pattern Analysis and Machine Intelligence

2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

Bayesian networks robust selection sparse Cholesky decomposition structure learning

Metrics

since February 2021

1292

Article info
views

774

PDF
downloads

RSS

Authors

Abstract

Supplementary material

References

Export citation

Copy and paste formatted citation

Download citation in file