<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1194</article-id>
<article-id pub-id-type="doi">10.6339/25-JDS1194</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Differentially Private Bayesian Envelope Regression via Sufficient Statistic Perturbation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Yu</surname><given-names>Peng</given-names></name><xref ref-type="aff" rid="j_jds1194_aff_001">1</xref><xref ref-type="fn" rid="j_jds1194_fn_001">†</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Jiang</surname><given-names>Yangdi</given-names></name><xref ref-type="aff" rid="j_jds1194_aff_001">1</xref><xref ref-type="fn" rid="j_jds1194_fn_001">†</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Su</surname><given-names>Zhihua</given-names></name><xref ref-type="aff" rid="j_jds1194_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wu</surname><given-names>Jiamei</given-names></name><xref ref-type="aff" rid="j_jds1194_aff_003">3</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Kong</surname><given-names>Lingchen</given-names></name><xref ref-type="aff" rid="j_jds1194_aff_003">3</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-0033-839X</contrib-id>
<name><surname>Jiang</surname><given-names>Bei</given-names></name><email xlink:href="mailto:bei1@ualberta.ca">bei1@ualberta.ca</email><xref ref-type="aff" rid="j_jds1194_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1194_aff_001"><label>1</label><institution>Department of Mathematical and Statistical Sciences, University of Alberta</institution>, Edmonton, <country>Canada</country></aff>
<aff id="j_jds1194_aff_002"><label>2</label><institution>nVerses Capital</institution>, Wellington, FL, <country>USA</country></aff>
<aff id="j_jds1194_aff_003"><label>3</label><institution>School of Mathematics and Statistics, Beijing Jiaotong University</institution>, Beijing, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:bei1@ualberta.ca">bei1@ualberta.ca</ext-link>.</corresp><fn id="j_jds1194_fn_001"><label>†</label>
<p>These two authors contributed equally to this paper.</p></fn>
</author-notes>
<pub-date pub-type="ppub"><year>2026</year></pub-date><pub-date pub-type="epub"><day>3</day><month>10</month><year>2025</year></pub-date><volume>24</volume><issue>1</issue><fpage>187</fpage><lpage>202</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1194_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>A compressed folder containing the code used to generate the results in Section 4 and to implement our proposed methods is available online.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>28</day><month>12</month><year>2024</year></date><date date-type="accepted"><day>24</day><month>6</month><year>2025</year></date></history>
<permissions><copyright-statement>2026 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2026</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>We propose a differentially private Bayesian framework for envelope regression, a technique that improves estimation efficiency by modelling the response as a function of a low-dimensional subspace of the predictors. Our method applies the analytic Gaussian mechanism to privatize sufficient statistics from the data, ensuring formal <inline-formula id="j_jds1194_ineq_001"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">ϵ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">δ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(\epsilon ,\delta )$]]></tex-math></alternatives></inline-formula>-differential privacy. We develop a tailored Gibbs sampling algorithm that performs valid Bayesian inference using only the noisy sufficient statistics. This approach leverages the envelope structure to isolate the variation in predictors that is relevant to the response, reducing estimation error compared to standard regression under the same privacy constraints. Through simulation studies, we demonstrate improved estimation accuracy and tighter credible intervals relative to a differentially private Bayesian linear regression baseline.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>credible interval</kwd>
<kwd>dimension reduction</kwd>
<kwd>MCMC</kwd>
<kwd>statistical inference</kwd>
</kwd-group>
<funding-group><funding-statement>The research received funding from the Canada CIFAR AI Chairs program, the Alberta Machine Intelligence Institute, the Natural Sciences and Engineering Research Council of Canada, and the Canadian Statistical Sciences Institute.</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1194_reflist_001">
<title>References</title>
<ref id="j_jds1194_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Aoshima</surname> <given-names>M</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>D</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yata</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>YH</given-names></string-name>, <string-name><surname>Marron</surname> <given-names>JS</given-names></string-name> (<year>2018</year>). <article-title>A survey of high dimension low sample size asymptotics</article-title>. <source><italic>Australian &amp; New Zealand Journal of Statistics</italic></source>, <volume>60</volume>: <fpage>4</fpage>–<lpage>19</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/anzs.12212" xlink:type="simple">https://doi.org/10.1111/anzs.12212</ext-link></mixed-citation>
</ref>
<ref id="j_jds1194_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Aoshima</surname> <given-names>M</given-names></string-name>, <string-name><surname>Yata</surname> <given-names>K</given-names></string-name> (<year>2017</year>). <article-title>Statistical inference for high-dimension, low-sample-size data</article-title>. <source><italic>American Mathematical Society, Sugaku Expositions</italic></source>, <volume>30</volume>: <fpage>137</fpage>–<lpage>158</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1090/suga/421" xlink:type="simple">https://doi.org/10.1090/suga/421</ext-link></mixed-citation>
</ref>
<ref id="j_jds1194_ref_003">
<mixed-citation publication-type="chapter"> <string-name><surname>Balle</surname> <given-names>B</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>YX</given-names></string-name> (<year>2018</year>). <chapter-title>Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>, <fpage>394</fpage>–<lpage>403</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_004">
<mixed-citation publication-type="chapter"> <string-name><surname>Bernstein</surname> <given-names>G</given-names></string-name>, <string-name><surname>Sheldon</surname> <given-names>D</given-names></string-name> (<year>2018</year>). <chapter-title>Differentially private Bayesian inference for exponential families</chapter-title>. In: <source><italic>Proceedings of the 32nd International Conference on Neural Information Processing Systems</italic></source>, <fpage>2924</fpage>–<lpage>2934</lpage>. <publisher-name>Curran Associates Inc.</publisher-name>, <publisher-loc>Red Hook, NY, USA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_005">
<mixed-citation publication-type="chapter"> <string-name><surname>Bernstein</surname> <given-names>G</given-names></string-name>, <string-name><surname>Sheldon</surname> <given-names>D</given-names></string-name> (<year>2019</year>). <chapter-title>Differentially private Bayesian linear regression</chapter-title>. In: <source><italic>Proceedings of the 33rd International Conference on Neural Information Processing Systems</italic></source>, <fpage>525</fpage>–<lpage>535</lpage>. <publisher-name>Curran Associates Inc.</publisher-name>, <publisher-loc>Red Hook, NY, USA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_006">
<mixed-citation publication-type="chapter"> <string-name><surname>Chanyaswad</surname> <given-names>T</given-names></string-name>, <string-name><surname>Dytso</surname> <given-names>A</given-names></string-name>, <string-name><surname>Poor</surname> <given-names>HV</given-names></string-name>, <string-name><surname>Mittal</surname> <given-names>P</given-names></string-name> (<year>2018</year>). <chapter-title>MVG mechanism: Differential privacy under matrix-valued query</chapter-title>. In: <source><italic>Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security</italic></source>, <fpage>230</fpage>–<lpage>246</lpage>. <publisher-name>Association for Computing Machinery</publisher-name>, <publisher-loc>New York, NY, USA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Chaudhuri</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sarwate</surname> <given-names>AD</given-names></string-name>, <string-name><surname>Sinha</surname> <given-names>K</given-names></string-name> (<year>2013</year>). <article-title>A near-optimal algorithm for differentially-private principal components</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>14</volume>(<issue>1</issue>): <fpage>2905</fpage>–<lpage>2943</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Cook</surname> <given-names>RD</given-names></string-name>, <string-name><surname>Li</surname> <given-names>B</given-names></string-name>, <string-name><surname>Chiaromonte</surname> <given-names>F</given-names></string-name> (<year>2010</year>). <article-title>Envelope models for parsimonious and efficient multivariate linear regression</article-title>. <source><italic>Statistica Sinica</italic></source>, <volume>20</volume>(<issue>3</issue>): <fpage>927</fpage>–<lpage>1010</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Cook</surname> <given-names>RD</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name> (<year>2015</year>). <article-title>Foundations for envelope models and methods</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>110</volume>(<issue>510</issue>): <fpage>599</fpage>–<lpage>611</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2014.983235" xlink:type="simple">https://doi.org/10.1080/01621459.2014.983235</ext-link></mixed-citation>
</ref>
<ref id="j_jds1194_ref_010">
<mixed-citation publication-type="chapter"> <string-name><surname>Dandekar</surname> <given-names>A</given-names></string-name>, <string-name><surname>Basu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Bressan</surname> <given-names>S</given-names></string-name> (<year>2018</year>). <chapter-title>Differential privacy for regularised linear regression</chapter-title>. In: <source><italic>International Conference on Database and Expert Systems Applications</italic></source>, <fpage>483</fpage>–<lpage>491</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Doe</surname> <given-names>J</given-names></string-name>, <string-name><surname>Roe</surname> <given-names>J</given-names></string-name> (<year>2021</year>). <article-title>Differential privacy techniques for census data analysis</article-title>. <source><italic>Journal of Census and Demographic Analysis</italic></source>, <volume>15</volume>(<issue>2</issue>): <fpage>123</fpage>–<lpage>137</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_012">
<mixed-citation publication-type="chapter"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>Kenthapadi</surname> <given-names>K</given-names></string-name>, <string-name><surname>McSherry</surname> <given-names>F</given-names></string-name>, <string-name><surname>Mironov</surname> <given-names>I</given-names></string-name>, <string-name><surname>Naor</surname> <given-names>M</given-names></string-name> (<year>2006</year>a). <chapter-title>Our data, ourselves: Privacy via distributed noise generation</chapter-title>. In: <source><italic>Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques</italic></source>. <series><italic>Proceedings</italic></series> <volume>25</volume>. <conf-loc>St. Petersburg, Russia</conf-loc>, <conf-date>May 28–June 1, 2006</conf-date>, <fpage>486</fpage>–<lpage>503</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_013">
<mixed-citation publication-type="chapter"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>McSherry</surname> <given-names>F</given-names></string-name>, <string-name><surname>Nissim</surname> <given-names>K</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>A</given-names></string-name> (<year>2006</year>b). <chapter-title>Calibrating noise to sensitivity in private data analysis</chapter-title>. In: <source><italic>Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006. Proceedings 3</italic></source>. <conf-loc>New York, NY, USA</conf-loc>, <conf-date>March 4–7, 2006</conf-date>, <fpage>265</fpage>–<lpage>284</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>Roth</surname> <given-names>A</given-names></string-name> (<year>2014</year>). <article-title>The algorithmic foundations of differential privacy</article-title>. <source><italic>Foundations and Trends in Theoretical Computer Science</italic></source>, <volume>9</volume>(<issue>3–4</issue>): <fpage>211</fpage>–<lpage>407</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Dyda</surname> <given-names>A</given-names></string-name>, <string-name><surname>Purcell</surname> <given-names>M</given-names></string-name>, <string-name><surname>Curtis</surname> <given-names>S</given-names></string-name>, <string-name><surname>Field</surname> <given-names>E</given-names></string-name>, <string-name><surname>Pillai</surname> <given-names>P</given-names></string-name>, <string-name><surname>Ricardo</surname> <given-names>K</given-names></string-name>, <etal>et al.</etal> (<year>2021</year>). <article-title>Differential privacy for public health data: An innovative tool to optimize information sharing while protecting data confidentiality</article-title>. <source><italic>Patterns</italic></source>, <volume>2</volume>(<issue>12</issue>). <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.patter.2021.100366" xlink:type="simple">https://doi.org/10.1016/j.patter.2021.100366</ext-link></mixed-citation>
</ref>
<ref id="j_jds1194_ref_016">
<mixed-citation publication-type="book"> <string-name><surname>Frühwirth-Schnatter</surname> <given-names>S</given-names></string-name> (<year>2006</year>). <source><italic>Finite Mixture and Markov Switching Models</italic></source>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Ju</surname> <given-names>N</given-names></string-name>, <string-name><surname>Awan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gong</surname> <given-names>R</given-names></string-name>, <string-name><surname>Rao</surname> <given-names>V</given-names></string-name> (<year>2022</year>). <article-title>Data augmentation MCMC for Bayesian inference from privatized data</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>35</volume>: <fpage>12732</fpage>–<lpage>12743</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_018">
<mixed-citation publication-type="chapter"> <string-name><surname>McSherry</surname> <given-names>F</given-names></string-name>, <string-name><surname>Mironov</surname> <given-names>I</given-names></string-name> (<year>2009</year>). <chapter-title>Differentially private recommender systems: Building privacy into the Netflix prize contenders</chapter-title>. In: <source><italic>Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</italic></source>, <fpage>627</fpage>–<lpage>636</lpage>. <publisher-name>Association for Computing Machinery</publisher-name>, <publisher-loc>New York, NY, USA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_019">
<mixed-citation publication-type="other"> <string-name><surname>Smith</surname> <given-names>A</given-names></string-name> (<year>2008</year>). Efficient, differentially private point estimators. arXiv preprint: <uri>https://arxiv.org/abs/0809.4794</uri>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_020">
<mixed-citation publication-type="chapter"> <string-name><surname>Talwar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Thakurta</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>L</given-names></string-name> (<year>2015</year>). <chapter-title>Nearly-optimal private lasso</chapter-title>. In: <source><italic>Proceedings of the 29th International Conference on Neural Information Processing Systems</italic></source>, <fpage>3025</fpage>–<lpage>3033</lpage>. <publisher-name>MIT Press</publisher-name>, <publisher-loc>Cambridge, MA, USA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Tierney</surname> <given-names>L</given-names></string-name> (<year>1994</year>). <article-title>Markov chains for exploring posterior distributions</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>22</volume>(<issue>4</issue>): <fpage>1701</fpage>–<lpage>1728</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_022">
<mixed-citation publication-type="chapter"> <string-name><surname>Wang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>J</given-names></string-name> (<year>2019</year>). <chapter-title>On sparse linear regression in the local differential privacy model</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>, <fpage>6628</fpage>–<lpage>6637</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1194_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Yao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Z</given-names></string-name> (<year>2018</year>). <article-title>Differential privacy with bias-control limited sources</article-title>. <source><italic>IEEE Transactions on Information Forensics and Security</italic></source>, <volume>13</volume>(<issue>5</issue>): <fpage>1230</fpage>–<lpage>1241</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/TIFS.2017.2780802" xlink:type="simple">https://doi.org/10.1109/TIFS.2017.2780802</ext-link></mixed-citation>
</ref>
<ref id="j_jds1194_ref_024">
<mixed-citation publication-type="chapter"> <string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Rubinstein</surname> <given-names>BIP</given-names></string-name>, <string-name><surname>Dimitrakakis</surname> <given-names>C</given-names></string-name> (<year>2016</year>). <chapter-title>On the differential privacy of Bayesian inference</chapter-title>. In: <source><italic>Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence</italic></source>, <fpage>2365</fpage>–<lpage>2371</lpage>. <publisher-name>AAAI Press</publisher-name>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
