Journal of Data Science


Differentially Private Bayesian Envelope Regression via Sufficient Statistic Perturbation
Peng Yu †, Yangdi Jiang †, Zhihua Su, et al. (6 authors in total)

https://doi.org/10.6339/25-JDS1194
Pub. online: 3 October 2025
Type: Philosophies of Data Science
Open Access

† These two authors contributed equally to this paper.

Received: 28 December 2024
Accepted: 24 June 2025
Published: 3 October 2025

Abstract

We propose a differentially private Bayesian framework for envelope regression, a technique that improves estimation efficiency by modelling the response as a function of a low-dimensional subspace of the predictors. Our method applies the analytic Gaussian mechanism to privatize sufficient statistics from the data, ensuring formal $(\epsilon ,\delta )$-differential privacy. We develop a tailored Gibbs sampling algorithm that performs valid Bayesian inference using only the noisy sufficient statistics. This approach leverages the envelope structure to isolate the variation in predictors that is relevant to the response, reducing estimation error compared to standard regression under the same privacy constraints. Through simulation studies, we demonstrate improved estimation accuracy and tighter credible intervals relative to a differentially private Bayesian linear regression baseline.
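The first step described in the abstract, releasing sufficient statistics through the analytic Gaussian mechanism, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which statistics are perturbed or how their sensitivity is bounded. The sketch assumes that the released statistic is the cross-product matrix of the stacked predictors and response, that each row is clipped to unit L2 norm, and that neighbouring datasets differ by replacing one row, which gives a conservative sensitivity bound of 2. The function names (`analytic_gaussian_sigma`, `privatize_sufficient_stats`) are hypothetical. The noise scale is the smallest σ for which the exact Gaussian privacy profile $\Phi(\Delta /2\sigma -\epsilon \sigma /\Delta )-{e^{\epsilon }}\Phi (-\Delta /2\sigma -\epsilon \sigma /\Delta )$ does not exceed δ, found here by bisection.

```python
import math
import numpy as np


def _phi(t):
    """Standard normal CDF, written with erfc for stability in the tails."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def gaussian_delta(sigma, eps, sens):
    """Exact delta achieved by Gaussian noise of scale sigma at privacy level eps."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return _phi(a - b) - math.exp(eps) * _phi(-a - b)


def analytic_gaussian_sigma(eps, delta, sens, iters=100):
    """Smallest sigma (up to bisection error) with gaussian_delta <= delta."""
    lo, hi = 0.0, sens
    # gaussian_delta decreases in sigma, so grow hi until it is private enough.
    while gaussian_delta(hi, eps, sens) > delta:
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sens) > delta:
            lo = mid
        else:
            hi = mid
    return hi  # hi always satisfies the delta constraint


def clip_rows(Z, bound=1.0):
    """Rescale rows so that each has L2 norm at most `bound` (an assumption)."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z * np.minimum(1.0, bound / np.maximum(norms, 1e-12))


def privatize_sufficient_stats(X, y, eps, delta, rng):
    """Return a noisy copy of Z^T Z, where Z = [X, y] with clipped rows."""
    Z = clip_rows(np.column_stack([X, y]))
    S = Z.T @ Z
    # Assumed bound: replacing one unit-norm row changes S by at most 2
    # in Frobenius norm (triangle inequality).
    sigma = analytic_gaussian_sigma(eps, delta, sens=2.0)
    d = S.shape[0]
    upper = np.triu(rng.normal(0.0, sigma, size=(d, d)))
    noise = upper + np.triu(upper, 1).T  # mirror so the release stays symmetric
    return S + noise, sigma


# Usage on synthetic data (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -0.5, 0.0]) + rng.normal(size=200)
S_noisy, sigma = privatize_sufficient_stats(X, y, eps=1.0, delta=1e-5, rng=rng)
```

Any downstream sampler, such as the Gibbs sampler the abstract mentions, would then work from `S_noisy` and the known `sigma` rather than from the raw data; that inference step is not sketched here because the page gives no details of it.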

Supplementary material

A compressed folder containing the code used to generate the results in Section 4 and to implement our proposed methods is available online.



Copyright
2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
credible interval; dimension reduction; MCMC; statistical inference

Funding
This research received funding from the Canada CIFAR AI Chairs program, the Alberta Machine Intelligence Institute, the Natural Sciences and Engineering Research Council of Canada, and the Canadian Statistical Sciences Institute.


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
