Differentially Private Bayesian Envelope Regression via Sufficient Statistic Perturbation
Pub. online: 3 October 2025
Type: Philosophies Of Data Science
Open Access
† These two authors contributed equally to this paper.
Received: 28 December 2024
Accepted: 24 June 2025
Published: 3 October 2025
Abstract
We propose a differentially private Bayesian framework for envelope regression, a technique that improves estimation efficiency by modelling the response as a function of a low-dimensional subspace of the predictors. Our method applies the analytic Gaussian mechanism to privatize sufficient statistics from the data, ensuring formal $(\epsilon ,\delta )$-differential privacy. We develop a tailored Gibbs sampling algorithm that performs valid Bayesian inference using only the noisy sufficient statistics. This approach leverages the envelope structure to isolate the variation in predictors that is relevant to the response, reducing estimation error compared to standard regression under the same privacy constraints. Through simulation studies, we demonstrate improved estimation accuracy and tighter credible intervals relative to a differentially private Bayesian linear regression baseline.
Supplementary material
A compressed folder containing the code used to generate the results in Section 4 and to implement our proposed methods is available online.