<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1044</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1044</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Inference for Optimal Differential Privacy Procedures for Frequency Tables</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Li</surname><given-names>Chengcheng</given-names></name><xref ref-type="aff" rid="j_jds1044_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname><given-names>Naisyin</given-names></name><xref ref-type="aff" rid="j_jds1044_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Xu</surname><given-names>Gongjun</given-names></name><email xlink:href="mailto:gongjun@umich.edu">gongjun@umich.edu</email><xref ref-type="aff" rid="j_jds1044_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1044_aff_001"><label>1</label>Department of Statistics, <institution>University of Michigan</institution>, Ann Arbor, MI 48109, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:gongjun@umich.edu">gongjun@umich.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2022</year></pub-date><pub-date pub-type="epub"><day>20</day><month>4</month><year>2022</year></pub-date><volume>20</volume><issue>2</issue><fpage>253</fpage><lpage>276</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1044_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>Supplementary Material available online includes proofs of theoretical results and additional simulation study results on inter- and intra-table merging.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>28</day><month>3</month><year>2022</year></date><date date-type="accepted"><day>30</day><month>3</month><year>2022</year></date></history>
<permissions><copyright-statement>2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2022</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>When data are released to the public, a vital concern is the risk of exposing personal information of the individuals who contributed to the data set. Many mechanisms have been proposed to protect individual privacy, but less attention has been paid to conducting valid inference on the resulting privacy-protected data sets. For frequency tables, privacy-protecting perturbations often lead to negative cell counts, and releasing such tables can undermine users’ confidence in the usefulness of the data. This paper focuses on releasing one-way frequency tables. We recommend an optimal mechanism that satisfies <italic>ϵ</italic>-differential privacy (DP) without producing negative cell counts. The procedure is optimal in the sense that the expected utility is maximized under a given privacy constraint. Valid inference procedures for testing goodness-of-fit are also developed for the DP-protected data. In particular, we propose a de-biased test statistic for the optimal procedure and derive its asymptotic distribution. We also introduce testing procedures for the commonly used Laplace and Gaussian mechanisms, which provide a good finite-sample approximation for the null distributions. Moreover, we provide the decay rate requirements on the privacy regime under which the inference procedures are valid. We further consider common user practices, such as merging related or neighboring cells or integrating statistical information obtained across different data sources, and derive valid testing procedures when these operations occur. Simulation studies show that our inference results hold well even when the sample size is relatively small. We compare the proposed procedure with the current field standards, including the Laplace and Gaussian mechanisms (both with and without post-processing that replaces negative cell counts with zeros) and the Binomial-Beta McClure-Reiter mechanism. Finally, we apply our method to the National Center for Early Development and Learning’s (NCEDL) multi-state studies data to demonstrate its practical applicability.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>goodness-of-fit</kwd>
<kwd>hypothesis testing</kwd>
<kwd>optimality</kwd>
<kwd>table merging</kwd>
</kwd-group>
<funding-group><funding-statement>This research was partially supported by NSF SES-1846747.</funding-statement></funding-group>
</article-meta>
</front>
<body/>
<back>
<ref-list id="j_jds1044_reflist_001">
<title>References</title>
<ref id="j_jds1044_ref_001">
<mixed-citation publication-type="chapter"> <string-name><surname>Abowd</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Vilhuber</surname> <given-names>L</given-names></string-name> (<year>2008</year>). <chapter-title>How protective are synthetic data?</chapter-title> In: <source>International Conference on Privacy in Statistical Databases</source>, <fpage>239</fpage>–<lpage>246</lpage>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York, U.S</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Avella-Medina</surname> <given-names>M</given-names></string-name> (<year>2021</year>). <article-title>Privacy-preserving parametric inference: A case for robust statistics</article-title>. <source>Journal of the American Statistical Association</source>, <volume>116</volume>(<issue>534</issue>): <fpage>969</fpage>–<lpage>983</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_003">
<mixed-citation publication-type="chapter"> <string-name><surname>Awan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Slavković</surname> <given-names>A</given-names></string-name> (<year>2018</year>). <chapter-title>Differentially private uniformly most powerful tests for binomial data</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source> (<string-name><given-names>S</given-names> <surname>Bengio</surname></string-name>, <string-name><given-names>H</given-names> <surname>Wallach</surname></string-name>, <string-name><given-names>H</given-names> <surname>Larochelle</surname></string-name>, <string-name><given-names>K</given-names> <surname>Grauman</surname></string-name>, <string-name><given-names>N</given-names> <surname>Cesa-Bianchi</surname></string-name>, <string-name><given-names>R</given-names> <surname>Garnett</surname></string-name>, eds.), volume <volume>31</volume>. <publisher-name>Curran Associates, Inc.</publisher-name>, <publisher-loc>New York, U.S</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Barrientos</surname> <given-names>AF</given-names></string-name>, <string-name><surname>Reiter</surname> <given-names>JP</given-names></string-name>, <string-name><surname>Machanavajjhala</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name> (<year>2019</year>). <article-title>Differentially private significance tests for regression coefficients</article-title>. <source>Journal of Computational and Graphical Statistics</source>, <volume>28</volume>(<issue>2</issue>): <fpage>440</fpage>–<lpage>453</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Bowen</surname> <given-names>CM</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>F</given-names></string-name> (<year>2020</year>). <article-title>Comparative study of differentially private data synthesis methods</article-title>. <source>Statistical Science</source>, <volume>35</volume>(<issue>2</issue>): <fpage>280</fpage>–<lpage>307</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_006">
<mixed-citation publication-type="chapter"> <string-name><surname>Campbell</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Bray</surname> <given-names>A</given-names></string-name>, <string-name><surname>Ritz</surname> <given-names>A</given-names></string-name>, <string-name><surname>Groce</surname> <given-names>A</given-names></string-name> (<year>2018</year>). <chapter-title>Differentially private ANOVA testing</chapter-title>. In: <source>2018 1st International Conference on Data Intelligence and Security (ICDIS)</source>, <fpage>281</fpage>–<lpage>285</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Canonne</surname> <given-names>CL</given-names></string-name>, <string-name><surname>Kamath</surname> <given-names>G</given-names></string-name>, <string-name><surname>Steinke</surname> <given-names>T</given-names></string-name> (<year>2020</year>). <article-title>The discrete Gaussian for differential privacy</article-title>. <source>Advances in Neural Information Processing Systems</source>, <volume>33</volume>: <fpage>15676</fpage>–<lpage>15688</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Charest</surname> <given-names>AS</given-names></string-name> (<year>2011</year>). <article-title>How can we analyze differentially private synthetic datasets?</article-title> <source>Journal of Privacy and Confidentiality</source>, <volume>2</volume>(<issue>2</issue>).</mixed-citation>
</ref>
<ref id="j_jds1044_ref_009">
<mixed-citation publication-type="chapter"> <string-name><surname>Chaudhuri</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sarwate</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sinha</surname> <given-names>K</given-names></string-name> (<year>2012</year>). <chapter-title>Near-optimal differentially private principal components</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source>, <fpage>989</fpage>–<lpage>997</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_010">
<mixed-citation publication-type="chapter"> <string-name><surname>Couch</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kazan</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>K</given-names></string-name>, <string-name><surname>Bray</surname> <given-names>A</given-names></string-name>, <string-name><surname>Groce</surname> <given-names>A</given-names></string-name> (<year>2019</year>). <chapter-title>Differentially private nonparametric hypothesis testing</chapter-title>. In: <source>Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security</source>, <fpage>737</fpage>–<lpage>751</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_011">
<mixed-citation publication-type="chapter"> <string-name><surname>Degue</surname> <given-names>KH</given-names></string-name>, <string-name><surname>Le Ny</surname> <given-names>J</given-names></string-name> (<year>2018</year>). <chapter-title>On differentially private Gaussian hypothesis testing</chapter-title>. In: <source>2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)</source>, <fpage>842</fpage>–<lpage>847</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_012">
<mixed-citation publication-type="chapter"> <string-name><surname>Ding</surname> <given-names>B</given-names></string-name>, <string-name><surname>Nori</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>P</given-names></string-name>, <string-name><surname>Allen</surname> <given-names>J</given-names></string-name> (<year>2018</year>). <chapter-title>Comparing population means under local differential privacy: with significance and power</chapter-title>. In: <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>, volume <volume>32</volume>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_013">
<mixed-citation publication-type="book"> <string-name><surname>Drechsler</surname> <given-names>J</given-names></string-name> (<year>2011</year>). <source>Synthetic Datasets for Statistical Disclosure Control: Theory and Implementation</source>, volume <volume>201</volume>. <publisher-name>Springer Science &amp; Business Media</publisher-name>, <publisher-loc>Berlin, Germany</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_014">
<mixed-citation publication-type="chapter"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>Kenthapadi</surname> <given-names>K</given-names></string-name>, <string-name><surname>McSherry</surname> <given-names>F</given-names></string-name>, <string-name><surname>Mironov</surname> <given-names>I</given-names></string-name>, <string-name><surname>Naor</surname> <given-names>M</given-names></string-name> (<year>2006</year>a). <chapter-title>Our data, ourselves: privacy via distributed noise generation</chapter-title>. In: <source>Annual International Conference on the Theory and Applications of Cryptographic Techniques</source>, <fpage>486</fpage>–<lpage>503</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_015">
<mixed-citation publication-type="chapter"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>McSherry</surname> <given-names>F</given-names></string-name>, <string-name><surname>Nissim</surname> <given-names>K</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>A</given-names></string-name> (<year>2006</year>b). <chapter-title>Calibrating noise to sensitivity in private data analysis</chapter-title>. In: <source>Theory of Cryptography Conference</source>, <fpage>265</fpage>–<lpage>284</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_016">
<mixed-citation publication-type="chapter"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>Naor</surname> <given-names>M</given-names></string-name>, <string-name><surname>Pitassi</surname> <given-names>T</given-names></string-name>, <string-name><surname>Rothblum</surname> <given-names>GN</given-names></string-name>, <string-name><surname>Yekhanin</surname> <given-names>S</given-names></string-name> (<year>2010</year>). <chapter-title>Pan-private streaming algorithms</chapter-title>. In: <source>ICS</source>, <fpage>66</fpage>–<lpage>80</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Dwork</surname> <given-names>C</given-names></string-name>, <string-name><surname>Roth</surname> <given-names>A</given-names></string-name> (<year>2014</year>). <article-title>The algorithmic foundations of differential privacy</article-title>. <source>Foundations and Trends in Theoretical Computer Science</source>, <volume>9</volume>(<issue>3–4</issue>): <fpage>211</fpage>–<lpage>407</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_018">
<mixed-citation publication-type="other"> <string-name><surname>Ferrando</surname> <given-names>C</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sheldon</surname> <given-names>D</given-names></string-name> (2020). Parametric bootstrap for differentially private confidence intervals. arXiv preprint: <uri>https://arxiv.org/abs/2006.07749</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Friedman</surname> <given-names>A</given-names></string-name>, <string-name><surname>Berkovsky</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kaafar</surname> <given-names>MA</given-names></string-name> (<year>2016</year>). <article-title>A differential privacy framework for matrix factorization recommender systems</article-title>. <source>User Modeling and User-Adapted Interaction</source>, <volume>26</volume>(<issue>5</issue>): <fpage>425</fpage>–<lpage>458</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_020">
<mixed-citation publication-type="chapter"> <string-name><surname>Gaboardi</surname> <given-names>M</given-names></string-name>, <string-name><surname>Lim</surname> <given-names>H</given-names></string-name>, <string-name><surname>Rogers</surname> <given-names>R</given-names></string-name>, <string-name><surname>Vadhan</surname> <given-names>S</given-names></string-name> (<year>2016</year>). <chapter-title>Differentially private Chi-squared hypothesis testing: Goodness of fit and independence testing</chapter-title>. In: <source>International Conference on Machine Learning</source>, <fpage>2111</fpage>–<lpage>2120</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_021">
<mixed-citation publication-type="chapter"> <string-name><surname>Geng</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Viswanath</surname> <given-names>P</given-names></string-name> (<year>2014</year>). <chapter-title>The optimal mechanism in differential privacy</chapter-title>. In: <source>2014 IEEE International Symposium on Information Theory</source>, <fpage>2371</fpage>–<lpage>2375</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Geng</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Viswanath</surname> <given-names>P</given-names></string-name> (<year>2015</year>). <article-title>The optimal noise-adding mechanism in differential privacy</article-title>. <source>IEEE Transactions on Information Theory</source>, <volume>62</volume>(<issue>2</issue>): <fpage>925</fpage>–<lpage>951</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Ghosh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Roughgarden</surname> <given-names>T</given-names></string-name>, <string-name><surname>Sundararajan</surname> <given-names>M</given-names></string-name> (<year>2012</year>). <article-title>Universally utility-maximizing privacy mechanisms</article-title>. <source>SIAM Journal on Computing</source>, <volume>41</volume>(<issue>6</issue>): <fpage>1673</fpage>–<lpage>1693</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_024">
<mixed-citation publication-type="chapter"> <string-name><surname>Golle</surname> <given-names>P</given-names></string-name>, <string-name><surname>Partridge</surname> <given-names>K</given-names></string-name> (<year>2009</year>). <chapter-title>On the anonymity of home/work location pairs</chapter-title>. In: <source>Pervasive Computing</source>, <fpage>390</fpage>–<lpage>397</lpage>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_025">
<mixed-citation publication-type="chapter"> <string-name><surname>Hay</surname> <given-names>M</given-names></string-name>, <string-name><surname>Machanavajjhala</surname> <given-names>A</given-names></string-name>, <string-name><surname>Miklau</surname> <given-names>G</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>D</given-names></string-name> (<year>2016</year>). <chapter-title>Principled evaluation of differentially private algorithms using dpbench</chapter-title>. In: <source>Proceedings of the 2016 International Conference on Management of Data</source>, <fpage>139</fpage>–<lpage>154</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_026">
<mixed-citation publication-type="chapter"> <string-name><surname>Johnson</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shmatikov</surname> <given-names>V</given-names></string-name> (<year>2013</year>). <chapter-title>Privacy-preserving data exploration in genome-wide association studies</chapter-title>. In: <source>Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, <fpage>1079</fpage>–<lpage>1087</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_027">
<mixed-citation publication-type="chapter"> <string-name><surname>Kairouz</surname> <given-names>P</given-names></string-name>, <string-name><surname>Bonawitz</surname> <given-names>K</given-names></string-name>, <string-name><surname>Ramage</surname> <given-names>D</given-names></string-name> (<year>2016</year>). <chapter-title>Discrete distribution estimation under local privacy</chapter-title>. In: <source>International Conference on Machine Learning</source>, <fpage>2436</fpage>–<lpage>2444</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_028">
<mixed-citation publication-type="other"> <string-name><surname>Karwa</surname> <given-names>V</given-names></string-name>, <string-name><surname>Krivitsky</surname> <given-names>PN</given-names></string-name>, <string-name><surname>Slavković</surname> <given-names>AB</given-names></string-name> (2015). Sharing social network data: differentially private estimation of exponential family random graph models. arXiv preprint: <uri>https://arxiv.org/abs/1511.02930</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_029">
<mixed-citation publication-type="journal"> <string-name><surname>Karwa</surname> <given-names>V</given-names></string-name>, <string-name><surname>Slavković</surname> <given-names>A</given-names></string-name> (<year>2016</year>). <article-title>Inference using noisy degrees: differentially private model and synthetic graphs</article-title>. <source>The Annals of Statistics</source>, <volume>44</volume>(<issue>1</issue>): <fpage>87</fpage>–<lpage>112</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Little</surname> <given-names>RJ</given-names></string-name> (<year>1993</year>). <article-title>Statistical analysis of masked data</article-title>. <source>Journal of Official Statistics</source>, <volume>9</volume>(<issue>2</issue>): <fpage>407</fpage>–<lpage>426</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Liu</surname> <given-names>C</given-names></string-name>, <string-name><surname>He</surname> <given-names>X</given-names></string-name>, <string-name><surname>Chanyaswad</surname> <given-names>T</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Mittal</surname> <given-names>P</given-names></string-name> (<year>2019</year>). <article-title>Investigating statistical privacy frameworks from the perspective of hypothesis testing</article-title>. <source>Proceedings on Privacy Enhancing Technologies</source>, <volume>3</volume>: <fpage>233</fpage>–<lpage>254</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_032">
<mixed-citation publication-type="other"> <string-name><surname>Clifford</surname> <given-names>RM</given-names></string-name>, <string-name><surname>Bryant</surname> <given-names>D</given-names></string-name>, <string-name><surname>Burchinal</surname> <given-names>M</given-names></string-name>, <string-name><surname>Barbarin</surname> <given-names>O</given-names></string-name>, <string-name><surname>Early</surname> <given-names>D</given-names></string-name>, <string-name><surname>Howes</surname> <given-names>C</given-names></string-name>, et al. (2017). National Center for Early Development and Learning Multistate Study of Pre-Kindergarten. Inter-university Consortium for Political and Social Research [distributor].</mixed-citation>
</ref>
<ref id="j_jds1044_ref_033">
<mixed-citation publication-type="chapter"> <string-name><surname>Machanavajjhala</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kifer</surname> <given-names>D</given-names></string-name>, <string-name><surname>Abowd</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gehrke</surname> <given-names>J</given-names></string-name>, <string-name><surname>Vilhuber</surname> <given-names>L</given-names></string-name> (<year>2008</year>). <chapter-title>Privacy: Theory meets practice on the map</chapter-title>. In: <source>2008 IEEE 24th International Conference on Data Engineering</source>, <fpage>277</fpage>–<lpage>286</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>McClure</surname> <given-names>D</given-names></string-name>, <string-name><surname>Reiter</surname> <given-names>JP</given-names></string-name> (<year>2012</year>). <article-title>Differential privacy and statistical disclosure risk measures: an investigation with binary synthetic data</article-title>. <source>Transactions on Data Privacy</source>, <volume>5</volume>(<issue>3</issue>): <fpage>535</fpage>–<lpage>552</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_035">
<mixed-citation publication-type="chapter"> <string-name><surname>Mohammed</surname> <given-names>N</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>R</given-names></string-name>, <string-name><surname>Fung</surname> <given-names>BC</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>PS</given-names></string-name> (<year>2011</year>). <chapter-title>Differentially private data release for data mining</chapter-title>. In: <source>Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, <fpage>493</fpage>–<lpage>501</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_036">
<mixed-citation publication-type="chapter"> <string-name><surname>Narayanan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shmatikov</surname> <given-names>V</given-names></string-name> (<year>2008</year>). <chapter-title>Robust de-anonymization of large sparse datasets</chapter-title>. In: <source>2008 IEEE Symposium on Security and Privacy (sp 2008)</source>, <fpage>111</fpage>–<lpage>125</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_037">
<mixed-citation publication-type="other"> <string-name><surname>Quick</surname> <given-names>H</given-names></string-name> (2019). Generating Poisson-distributed differentially private synthetic data. arXiv preprint: <uri>https://arxiv.org/abs/1906.00455</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_038">
<mixed-citation publication-type="journal"> <string-name><surname>Raab</surname> <given-names>GM</given-names></string-name>, <string-name><surname>Nowok</surname> <given-names>B</given-names></string-name>, <string-name><surname>Dibben</surname> <given-names>C</given-names></string-name> (<year>2016</year>). <article-title>Practical data synthesis for large samples</article-title>. <source>Journal of Privacy and Confidentiality</source>, <volume>7</volume>(<issue>3</issue>): <fpage>67</fpage>–<lpage>97</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_039">
<mixed-citation publication-type="journal"> <string-name><surname>Raghunathan</surname> <given-names>TE</given-names></string-name>, <string-name><surname>Reiter</surname> <given-names>JP</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>2003</year>). <article-title>Multiple imputation for statistical disclosure limitation</article-title>. <source>Journal of Official Statistics</source>, <volume>19</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>16</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_040">
<mixed-citation publication-type="journal"> <string-name><surname>Reiter</surname> <given-names>JP</given-names></string-name> (<year>2005</year>). <article-title>Using CART to generate partially synthetic public use microdata</article-title>. <source>Journal of Official Statistics</source>, <volume>21</volume>(<issue>3</issue>): <fpage>441</fpage>–<lpage>462</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_041">
<mixed-citation publication-type="journal"> <string-name><surname>Rinott</surname> <given-names>Y</given-names></string-name>, <string-name><surname>O’Keefe</surname> <given-names>CM</given-names></string-name>, <string-name><surname>Shlomo</surname> <given-names>N</given-names></string-name>, <string-name><surname>Skinner</surname> <given-names>C</given-names></string-name> (<year>2018</year>). <article-title>Confidentiality and differential privacy in the dissemination of frequency tables</article-title>. <source>Statistical Science</source>, <volume>33</volume>(<issue>3</issue>): <fpage>358</fpage>–<lpage>385</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_042">
<mixed-citation publication-type="chapter"> <string-name><surname>Rogers</surname> <given-names>R</given-names></string-name>, <string-name><surname>Roth</surname> <given-names>A</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>A</given-names></string-name>, <string-name><surname>Thakkar</surname> <given-names>O</given-names></string-name> (<year>2016</year>). <chapter-title>Max-information, differential privacy, and post-selection hypothesis testing</chapter-title>. In: <source>2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)</source>, <fpage>487</fpage>–<lpage>494</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_043">
<mixed-citation publication-type="journal"> <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1993</year>). <article-title>Statistical disclosure limitation</article-title>. <source>Journal of Official Statistics</source>, <volume>9</volume>(<issue>2</issue>): <fpage>461</fpage>–<lpage>468</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_044">
<mixed-citation publication-type="chapter"> <string-name><surname>Sheffet</surname> <given-names>O</given-names></string-name> (<year>2017</year>). <chapter-title>Differentially private ordinary least squares</chapter-title>. In: <source>International Conference on Machine Learning</source>, <fpage>3105</fpage>–<lpage>3114</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_045">
<mixed-citation publication-type="other"> <string-name><surname>Snoke</surname> <given-names>J</given-names></string-name>, <string-name><surname>Raab</surname> <given-names>G</given-names></string-name>, <string-name><surname>Nowok</surname> <given-names>B</given-names></string-name>, <string-name><surname>Dibben</surname> <given-names>C</given-names></string-name>, <string-name><surname>Slavkovic</surname> <given-names>A</given-names></string-name> (2016). General and specific utility measures for synthetic data. arXiv preprint: <uri>https://arxiv.org/abs/1604.06651</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_046">
<mixed-citation publication-type="other"> <string-name><surname>Sweeney</surname> <given-names>L</given-names></string-name> (2013). Matching known patients to health records in Washington state data. arXiv preprint: <uri>https://arxiv.org/abs/1307.1370</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_047">
<mixed-citation publication-type="chapter"> <string-name><surname>Task</surname> <given-names>C</given-names></string-name>, <string-name><surname>Clifton</surname> <given-names>C</given-names></string-name> (<year>2016</year>). <chapter-title>Differentially private significance testing on paired-sample data</chapter-title>. In: <source>Proceedings of the 2016 SIAM International Conference on Data Mining (SDM)</source>, <fpage>153</fpage>–<lpage>161</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_048">
<mixed-citation publication-type="chapter"> <string-name><surname>Vu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Slavkovic</surname> <given-names>A</given-names></string-name> (<year>2009</year>). <chapter-title>Differential privacy for clinical trial data: Preliminary evaluations</chapter-title>. In: <source>2009 IEEE International Conference on Data Mining Workshops</source>, <fpage>138</fpage>–<lpage>143</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_049">
<mixed-citation publication-type="chapter"> <string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Li</surname> <given-names>YF</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>X</given-names></string-name> (<year>2009</year>). <chapter-title>Learning your identity and disease from research papers: information leaks in genome wide association study</chapter-title>. In: <source>Proceedings of the 16th ACM Conference on Computer and Communications Security</source>, <fpage>534</fpage>–<lpage>544</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_050">
<mixed-citation publication-type="other"> <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>J</given-names></string-name>, <string-name><surname>Kifer</surname> <given-names>D</given-names></string-name> (2015a). Revisiting differentially private hypothesis tests for categorical data. arXiv preprint: <uri>https://arxiv.org/abs/1511.03376</uri>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_051">
<mixed-citation publication-type="chapter"> <string-name><surname>Wang</surname> <given-names>YX</given-names></string-name>, <string-name><surname>Fienberg</surname> <given-names>S</given-names></string-name>, <string-name><surname>Smola</surname> <given-names>A</given-names></string-name> (<year>2015</year>b). <chapter-title>Privacy for free: posterior sampling and stochastic gradient Monte Carlo</chapter-title>. In: <source>International Conference on Machine Learning</source>, <fpage>2493</fpage>–<lpage>2502</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_052">
<mixed-citation publication-type="journal"> <string-name><surname>Wasserman</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>S</given-names></string-name> (<year>2010</year>). <article-title>A statistical framework for differential privacy</article-title>. <source>Journal of the American Statistical Association</source>, <volume>105</volume>(<issue>489</issue>): <fpage>375</fpage>–<lpage>389</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1044_ref_053">
<mixed-citation publication-type="journal"> <string-name><surname>Yu</surname> <given-names>F</given-names></string-name>, <string-name><surname>Fienberg</surname> <given-names>SE</given-names></string-name>, <string-name><surname>Slavković</surname> <given-names>AB</given-names></string-name>, <string-name><surname>Uhler</surname> <given-names>C</given-names></string-name> (<year>2014</year>). <article-title>Scalable privacy-preserving data sharing methodology for genome-wide association studies</article-title>. <source>Journal of Biomedical Informatics</source>, <volume>50</volume>: <fpage>133</fpage>–<lpage>141</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
