
JDS

Journal of Data Science

ISSN: 1683-8602, 1680-743X

School of Statistics, Renmin University of China

JDS1194

10.6339/25-JDS1194

Statistical Data Science

Differentially Private Bayesian Envelope Regression via Sufficient Statistic Perturbation

Peng (1, †)
Yangdi Jiang (1, †)
Zhihua Su (2)
Jiamei Wu (3)
Lingchen Kong
https://orcid.org/0000-0002-0033-839X
Bei Jiang (1, ∗), bei1@ualberta.ca

1 Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
2 nVerses Capital, Wellington, FL, USA
3 School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, China

∗ Corresponding author. Email: bei1@ualberta.ca.
† These two authors contributed equally to this paper.

2026, Volume 24, Issue 1, pages 187–202

Published online 3 October 2025

Supplementary Material

A compressed folder containing the code used to generate the results in Section 4 and to implement our proposed methods is available online.

Received 28 December 2024; accepted 24 June 2025

2026 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.


Open access article under the CC BY license.

We propose a differentially private Bayesian framework for envelope regression, a technique that improves estimation efficiency by modelling the response as a function of a low-dimensional subspace of the predictors. Our method applies the analytic Gaussian mechanism to privatize sufficient statistics from the data, ensuring formal (ϵ, δ)-differential privacy. We develop a tailored Gibbs sampling algorithm that performs valid Bayesian inference using only the noisy sufficient statistics. This approach leverages the envelope structure to isolate the variation in predictors that is relevant to the response, reducing estimation error compared to standard regression under the same privacy constraints. Through simulation studies, we demonstrate improved estimation accuracy and tighter credible intervals relative to a differentially private Bayesian linear regression baseline.
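As a rough illustration of the privatization step described above, the following Python sketch calibrates Gaussian noise from the exact privacy condition used by the analytic Gaussian mechanism and adds it to the second-moment matrix of the stacked data (X, y). The row clipping bound `clip`, the add/remove-one-record neighbouring relation, the resulting sensitivity `clip ** 2`, and all function names are assumptions made for this example rather than details taken from the paper. The Gibbs sampler that would consume the noisy statistics is not shown.

```python
import math
import numpy as np


def _std_normal_cdf(t):
    # Standard normal CDF via erfc, which stays accurate in the tails.
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def gaussian_delta(sigma, eps, sensitivity):
    # Smallest delta such that N(0, sigma^2) noise on a query with the given
    # L2 sensitivity is (eps, delta)-DP (exact analytic Gaussian condition).
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return _std_normal_cdf(a - b) - math.exp(eps) * _std_normal_cdf(-a - b)


def calibrate_sigma(eps, delta, sensitivity, iters=200):
    # Bisection for (approximately) the smallest sigma meeting the target delta.
    lo, hi = 0.0, sensitivity
    while gaussian_delta(hi, eps, sensitivity) > delta:
        hi *= 2.0  # grow the bracket until hi is private enough
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi  # hi always satisfies the privacy condition


def privatize_sufficient_stats(X, y, eps, delta, clip=1.0, rng=None):
    # Assumption: each stacked row z_i = (x_i, y_i) is clipped to L2 norm <= clip.
    rng = np.random.default_rng() if rng is None else rng
    Z = np.column_stack([X, y])
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Z = Z * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    S = Z.T @ Z  # contains X'X, X'y and y'y as blocks
    # Adding or removing one row changes S by z z', and ||z z'||_F = ||z||^2.
    sensitivity = clip ** 2
    sigma = calibrate_sigma(eps, delta, sensitivity)
    noisy = S + rng.normal(0.0, sigma, size=S.shape)
    # Symmetrizing is post-processing, so it does not affect the guarantee.
    return 0.5 * (noisy + noisy.T), sigma
```

Calling `privatize_sufficient_stats(X, y, eps=0.5, delta=1e-5)` returns the noisy matrix together with the noise scale; a downstream sampler would need that scale to model the perturbation explicitly.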

Keywords: credible interval; dimension reduction; MCMC; statistical inference

The research received funding from the Canada CIFAR AI Chairs program, the Alberta Machine Intelligence Institute, the Natural Sciences and Engineering Research Council of Canada, and the Canadian Statistical Sciences Institute.
