Journal of Data Science logo


Login Register

  1. Home
  2. Issues
  3. Volume 21, Issue 3 (2023): Special Issue: Advances in Network Data Science
  4. A Mixed-Membership Model for Social Netw ...

Journal of Data Science

Submit your article Information
  • Article info
  • More
    Article info

A Mixed-Membership Model for Social Network Clustering
Volume 21, Issue 3 (2023): Special Issue: Advances in Network Data Science, pp. 508–522
Guang Ouyang   Dipak K. Dey   Panpan Zhang ORCID icon link to view author Panpan Zhang details  

Authors

 
Placeholder
https://doi.org/10.6339/23-JDS1109
Pub. online: 7 August 2023      Type: Statistical Data Science      Open accessOpen Access

Received
7 November 2021
Accepted
4 July 2023
Published
7 August 2023

Abstract

We propose a simple mixed membership model for social network clustering in this paper. A flexible function is adopted to measure affinities among a set of entities in a social network. The model not only allows each entity in the network to possess more than one membership, but also provides accurate statistical inference about network structure. We estimate the membership parameters using an MCMC algorithm. We evaluate the performance of the proposed algorithm by applying our model to two empirical social network data, the Zachary club data and the bottlenose dolphin network data. We also conduct some numerical studies based on synthetic networks for further assessing the effectiveness of our algorithm. In the end, some concluding remarks and future work are addressed briefly.

Supplementary material

 Supplementary Material
The codes for Algorithm 1 and the implementations can be found on the journal website. The results of empirical data applications are saved in RDS files.

References

 
Abbe E (2018). Community detection and stochastic block models: Recent developments. Journal of Machine Learning Research, 18(177): 1–86.
 
Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9(65): 888–896.
 
Athreya A, Fishkind DE, Tang M, Priebe CE, Park Y, Vogelstein JT, et al. (2018). Statistical inference on random dot product graphs: A survey. Journal of Machine Learning Research, 18(226): 1–92.
 
Barabási AL, Albert R (1999). Emergence of scaling in random networks. Nature, 286(5439): 509–512.
 
Betancourt M (2017). A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint: https://arxiv.org/abs/1701.02434
 
Bickle PJ, Chen A (2009). A nonparametric view of network models and Newman-Girvan and other modularities. Proceedings of the National Academy of Sciences of the United States of America, 160(50): 21068–21073. https://doi.org/10.1073/pnas.0907096106
 
Blei DM, Ng AY, Jordan MI (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3: 993–1022.
 
Boguñá M, Pastor-Satorras R, Díaz-Guilera A, Arenas A (2004). Models of social networks based on social distance attachment. Physical Review E, 70(5): 056122. https://doi.org/10.1103/PhysRevE.70.056122
 
Cartwright D, Harary F (1956). Structure balance: A generalization of Heider’s theory. Psychological Review, 63(5): 277–293. https://doi.org/10.1037/h0046049
 
Casella G, George EI (1992). Explaining the Gibbs sampler. The American Statistician, 46(3): 167–174. https://doi.org/10.1080/00031305.1992.10475878
 
Freeman LC (1977). A set of measures of centrality based on betweenness. Sociometry, 40(1): 35–41. https://doi.org/10.2307/3033543
 
Fronczak P, Fronczak A, Bujok M (2013). Exponential random graph models for networks with community structure. Physical Review E, 88(3): 032810. https://doi.org/10.1103/PhysRevE.88.032810
 
Gao C, Ma Z, Zhang AY, Zhou HH (2018). Community detection in degree-corrected block models. The Annals of Statistics, 46(5): 2153–2185.
 
Gelfand AE, Smith AFE (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85(410): 398–409. https://doi.org/10.1080/01621459.1990.10476213
 
Geman S, Genman D (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6): 721–741. https://doi.org/10.1109/TPAMI.1984.4767596
 
Geng J, Bhattacharya A, Pati D (2019). Probabilistic community detection with unknown number of communities. Journal of the American Statistical Association, 114(526): 893–905. https://doi.org/10.1080/01621459.2018.1458618
 
Gilbert EN (1959). Random graphs. The Annals of Mathematical Statistics, 30(4): 1141–1144. https://doi.org/10.1214/aoms/1177706098
 
Girvan M, Newman MEJ (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12): 7821–7826. https://doi.org/10.1073/pnas.122653799
 
Handcock MS, Raftery AE (2007). Model-based clustering for social networks. Journal of the Royal Statistical Society. Series A. Statistics in Society, 170: 301–354. https://doi.org/10.1111/j.1467-985X.2007.00471.x
 
Harary F (1953). On the notion of balance of a signed graph. The Michigan Mathematical Journal, 2(2): 143–146. https://doi.org/10.1307/mmj/1028989917
 
Hoff PD, Raftery AE, Handcock MS (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460): 1090–1098. https://doi.org/10.1198/016214502388618906
 
Holland PW, Laskey KB, Leinhardt S (1983). Stochastic blockmodels: First steps. Social Networks, 5(2): 109–137. https://doi.org/10.1016/0378-8733(83)90021-7
 
Huang W, Liu Y, Chen Y (2020). Mixed membership stochastic blockmodels for heterogeneous networks. Bayesian Analysis, 15(3): 711–736. https://doi.org/10.1214/19-BA1163
 
Hunter DR, Handcock MS, Butts CT, Goodreau Morris M SM (2008). ergm: A package to fit, simulate and diagnose exponential-family models for networks. Journal of Statistical Software, 24(3): 1–29. https://doi.org/10.18637/jss.v024.i03
 
Lusseau D, Newman MEJ (2004). Identifying the role that animals play in their social networks. Proceedings of the Royal Society B, 271(supp(6)): 477–481.
 
Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003). The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behavioral Ecology and Sociobiology, 54: 396–405. https://doi.org/10.1007/s00265-003-0651-y
 
Lyzinski V, Tang M, Athreya A, Park Y, Priebe CE (2017). Community detection and classification in hierarchical stochastic blockmodels. IEEE Transactions on Network Science and Engineering, 4(1): 13–26. https://doi.org/10.1109/TNSE.2016.2634322
 
Marchette DJ, Priebe CE (2008). Predicting unobserved links in incompletely observed networks. Computational Statistics & Data Analysis, 52(3): 1373–1386. https://doi.org/10.1016/j.csda.2007.03.016
 
Meilǎ M (2007). Comparing clustering—an information based distance. Journal of Multivariate Analysis, 98(5): 873–895. https://doi.org/10.1016/j.jmva.2006.11.013
 
Neal RM (2011). MCMC using Hamiltonian dynamics. In: Handbook of Markov Chain Monte Carlo (S Brooks, A Gelman, G Jones, XL Meng, eds.), 113–162. Chapman & Hall/CRC, Boca Raton, FL, USA.
 
Newman MEJ (2001). The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences of the United States of America, 98(2): 404–409. https://doi.org/10.1073/pnas.98.2.404
 
Newman MEJ (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103: 8577–8582 (2006). https://doi.org/10.1073/pnas.0601602103
 
Newman MEJ, Strogatz SH, Watts DJ (2001). Random graphs with arbitrary degree distributions and their applications. Physical Review E, 64(2): 026118. https://doi.org/10.1103/PhysRevE.64.026118
 
Newman MEJ, Watts DJ, Strogatz SH (2002). Random graph models of social networks. Proceedings of the National Academy of Sciences of the United States of America, 99(supp(1): 2566–2572. https://doi.org/10.1073/pnas.012582999
 
Ng AY, Jordan MI, Weiss Y (2001). On spectral clustering: Analysis and an algorithm. In: Advances in Neural Information Processing Systems 14 (TG Dietterich, S Becker, Z Ghahramani, eds.), 849–856. MIT Press, Cambridge, MA, USA.
 
Noroozi M, Pensky M (2022). The hierarchy of block models. Sankhya. Series A, 84: 64–107. https://doi.org/10.1007/s13171-021-00247-2
 
Nowicki K, Snijders TAB (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96(455): 1077–1087. https://doi.org/10.1198/016214501753208735
 
Ouyang G, Dipak DK, Zhang P (2020). Clique-based method for social network clustering. Journal of Classification, 37: 254–274. https://doi.org/10.1007/s00357-019-9310-5
 
Rand WM (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336): 846–850. https://doi.org/10.1080/01621459.1971.10482356
 
Rapoport A (1949a). Outline of a probabilistic approach to animal sociology: I. The Bulletin of Mathematical Biophysics, 11(3): 183–196. https://doi.org/10.1007/BF02478364
 
Rapoport A (1949b). Outline of a probabilistic approach to animal sociology: II. The Bulletin of Mathematical Biophysics, 11(4): 273–281. https://doi.org/10.1007/BF02477980
 
Rapoport A (1950). Outline of a probabilistic approach to animal sociology: III. The Bulletin of Mathematical Biophysics, 12(1): 7–17. https://doi.org/10.1007/BF02477340
 
Sengupta S, Chen Y (2018). A block model for node popularity in networks with community structure. Journal of the Royal Statistical Society, Series B, Statistical Methodology, 80(2): 365–386. https://doi.org/10.1111/rssb.12245
 
Sewell DK, Chen Y (2017). Latent space approaches to community detection in dynamic networks. Bayesian Analysis, 12(2): 351–377. https://doi.org/10.1214/16-BA1000
 
Shi J, Malik J (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8): 888–905. https://doi.org/10.1109/34.868688
 
Snijders TAB (2001). Statistical models for social networks. Annual Review of Sociology, 37: 131–153. https://doi.org/10.1146/annurev.soc.012809.102709
 
Snijders TAB, Nowicki K (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14: 75–100. https://doi.org/10.1007/s003579900004
 
Snijders TAB, Pattison PE, Robins GL, Handcock MS (2006). New specifications for exponential random graph models. Sociological Methodology, 36(1): 99–153. https://doi.org/10.1111/j.1467-9531.2006.00176.x
 
Toivonen R, Kovanen L, Kivelä M, Onnela JP, Saramäki J, Kaski K (2009). A comparative study of social network models: Network evolution models and nodal attribute models. Social Networks, 31(4): 240–254. https://doi.org/10.1016/j.socnet.2009.06.004
 
Watts DJ, Strogatz SH (1998). Collective dynamics of “small-world” networks. Nature, 393: 440–442. https://doi.org/10.1038/30918
 
Xie J, Kelley S, Szymański BK (2013). Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Computing Surveys, 45(4): 43. https://doi.org/10.1145/2501654.2501657
 
Young SJ, Scheinerman ER (2007). Random dot product graph models for social networks. In: WAW 2007: Algorithms and Models for the Web-Graph (A Bonato, FRK Chung, eds.), 138–149. Springer, Berlin, Heidelberg.
 
Zachary WW (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4): 452–473. https://doi.org/10.1086/jar.33.4.3629752

PDF XML
PDF XML

Copyright
2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
by logo by logo
Open access article under the CC BY license.

Keywords
cosine similarity MCMC algorithm mixed membership social network clustering stochastic blockmodels

Metrics
since February 2021
378

Article info
views

235

PDF
downloads

Export citation

Copy and paste formatted citation
Placeholder

Download citation in file


Share


RSS

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X

About

  • About journal

For contributors

  • Submit
  • OA Policy
  • Become a Peer-reviewer

Contact us

  • JDS@ruc.edu.cn
  • No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China
Powered by PubliMill  •  Privacy policy