Journal of Data Science logo


Login Register

  1. Home
  2. Issues
  3. Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022
  4. FROSTY: A High-Dimensional Scale-Free Ba ...

Journal of Data Science

Submit your article Information
  • Article info
  • Related articles
  • More
    Article info Related articles

FROSTY: A High-Dimensional Scale-Free Bayesian Network Learning Method
Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022, pp. 354–367
Joshua Bang   Sang-Yun Oh  

Authors

 
Placeholder
https://doi.org/10.6339/23-JDS1097
Pub. online: 14 March 2023      Type: Statistical Data Science      Open accessOpen Access

Received
31 July 2022
Accepted
7 March 2023
Published
14 March 2023

Abstract

We propose a scalable Bayesian network learning algorithm based on sparse Cholesky decomposition. Our approach only requires observational data and user-specified confidence level as inputs and can estimate networks with thousands of variables. The computational complexity of the proposed method is $O({p^{3}})$ for a graph with p vertices. Extensive numerical experiments illustrate the usefulness of our method with promising results. In simulation, the initial step in our approach also improves an alternative Bayesian network structure estimation method that uses an undirected graph as an input.

Supplementary material

 Supplementary Material
Supplementary material includes proofs of the theorems and the Python code for simulation.

References

 
Abramson B, Brown J, Edwards W, Murphy A, Winkler RL (1996). Hailfinder: A Bayesian system for forecasting severe weather. International Journal of Forecasting, 12(1): 57–71. https://doi.org/10.1016/0169-2070(95)00664-8
 
Amestoy PR, Davis TA, Duff IS (1996). An approximate minimum degree ordering algorithm. SIAM Journal on Matrix Analysis and Applications, 17(4): 886–905. https://doi.org/10.1137/S0895479894278952
 
Barabási AL, Albert R (1999). Emergence of scaling in random networks. Science, 286(5439): 509–512. https://doi.org/10.1126/science.286.5439.509
 
Cisneros-Velarde P, Petersen A, Oh SY (2020). Distributionally robust formulation and model selection for the graphical lasso. In: International Conference on Artificial Intelligence and Statistics, 756–765, PMLR.
 
Friedman J, Hastie T, Tibshirani R (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3): 432–441. https://doi.org/10.1093/biostatistics/kxm045
 
Friedman N (2004). Inferring cellular networks using probabilistic graphical models. Science, 303(5659): 799–805. https://doi.org/10.1126/science.1094068
 
Goodfellow I, Bengio Y, Courville A, Bengio Y (2016). Deep learning, volume 1. MIT press, Cambridge.
 
Hastie T, Tibshirani R, Friedman JH, Friedman JH (2009). The elements of statistical learning: data mining, inference, and prediction, volume 2. Springer.
 
Heckerman DE, Nathwani BN (1992). An evaluation of the diagnostic accuracy of pathfinder. Computers and Biomedical Research, 25(1): 56–74. https://doi.org/10.1016/0010-4809(92)90035-9
 
Huang X, Acero A, Hon HW, Reddy R (2001). Spoken language processing: A guide to theory, algorithm, and system development. Prentice hall PTR.
 
Luo J, Savakis AE, Singhal A (2005). A Bayesian network-based framework for semantic image understanding. Pattern recognition, 38(6): 919–934. https://doi.org/10.1016/j.patcog.2004.11.001
 
Pearl J (2014). Probabilistic reasoning in intelligent systems: networks of plausible inference. Elsevier.
 
Pourahmadi M (1999). Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterisation. Biometrika, 86(3): 677–690. https://doi.org/10.1093/biomet/86.3.677
 
Raskutti G, Uhler C (2018). Learning directed acyclic graph models based on sparsest permutations. Stat, 7(1): e183. https://doi.org/10.1002/sta4.183
 
Robinson RW (1977). Counting unlabeled acyclic digraphs. In: Combinatorial Mathematics V: Proceedings of the Fifth Australian Conference, Held at the Royal Melbourne Institute of Technology, August 24–26, 1976, 28–43. Springer.
 
Rose DJ (1970). Symmetric elimination on sparse positive definite systems and the potential flow network problem, Ph.D. thesis, Harvard University.
 
Rose DJ (1972). A graph-theoretic study of the numerical solution of sparse positive definite systems of linear equations. In: Graph theory and computing, 183–217. Elsevier.
 
Rue H, Held L (2010). Discrete spatial variation. In: Handbook of spatial statistics, 171–200.
 
Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721): 523–529. https://doi.org/10.1126/science.1105809
 
Sioutis M, Condotta JF (2014). Tackling large qualitative spatial networks of scale-free-like structure. In: Artificial Intelligence: Methods and Applications: 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15–17, 2014. Proceedings 8, 178–191. Springer.
 
Spirtes P, Glymour CN, Scheines R, Heckerman D (2000). Causation, prediction, and search. MIT press.
 
Squires C, Amaniampong J, Uhler C (2020). Efficient permutation discovery in causal dags. arXiv preprint: https://arxiv.org/abs/2011.03610.
 
Tran C, Cisneros-Velarde P, Oh SY, Petersen A (2022). Family-wise error rate control in Gaussian graphical model selection via distributionally robust optimization. Stat, 11(1): e477, Wiley Online Library.
 
Van de Geer S, Bühlmann P, et al. (2013). ${\ell _{0}}$-penalized maximum likelihood for sparse directed acyclic graphs. Annals of Statistics, 41(2): 536–567.
 
Verma TS, Pearl J (2022). Equivalence and synthesis of causal models. In: Probabilistic and Causal Inference: The Works of Judea Pearl. 221–236.
 
Wainwright MJ, Jordan MI (2008). Graphical models, exponential families, and variational inference. Now Publishers Inc.
 
Yannakakis M (1981). Computing the minimum fill-in is np-complete. SIAM Journal on Algebraic Discrete Methods, 2(1): 77–79. https://doi.org/10.1137/0602010
 
IEEE Transactions on Pattern Analysis and Machine Intelligence

Related articles PDF XML
Related articles PDF XML

Copyright
2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
by logo by logo
Open access article under the CC BY license.

Keywords
Bayesian networks robust selection sparse Cholesky decomposition structure learning

Metrics
since February 2021
774

Article info
views

282

PDF
downloads

Export citation

Copy and paste formatted citation
Placeholder

Download citation in file


Share


RSS

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X

About

  • About journal

For contributors

  • Submit
  • OA Policy
  • Become a Peer-reviewer

Contact us

  • JDS@ruc.edu.cn
  • No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China
Powered by PubliMill  •  Privacy policy