<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1186</article-id>
<article-id pub-id-type="doi">10.6339/25-JDS1186</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Impact of Data Perturbation for Statistical Disclosure Control on the Predictive Performance of Machine Learning Techniques</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Johnson III</surname><given-names>Thomas</given-names></name><xref ref-type="aff" rid="j_jds1186_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-8113-702X</contrib-id>
<name><surname>Mostafa</surname><given-names>Sayed A.</given-names></name><email xlink:href="mailto:sabdelmegeed@ncat.edu">sabdelmegeed@ncat.edu</email><xref ref-type="aff" rid="j_jds1186_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1186_aff_001"><label>1</label>Department of Mathematics &amp; Statistics, <institution>North Carolina A&amp;T State University</institution>, Greensboro, NC, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:sabdelmegeed@ncat.edu">sabdelmegeed@ncat.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2025</year></pub-date><pub-date pub-type="epub"><day>23</day><month>4</month><year>2025</year></pub-date><volume>23</volume><issue>2</issue><fpage>312</fpage><lpage>331</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1186_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The supplementary material includes the following: (1) README: a brief explanation of the supplementary material; (2) a detailed description of the predictive machine learning techniques compared in this paper and additional simulation results; and (3) R code files.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>17</day><month>8</month><year>2024</year></date><date date-type="accepted"><day>9</day><month>4</month><year>2025</year></date></history>
<permissions><copyright-statement>2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2025</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The rapid accumulation and release of data have fueled research across various fields. While numerous methods exist for data collection and storage, data distribution presents challenges, as some datasets are restricted, and certain subsets may compromise privacy if released unaltered. Statistical disclosure control (SDC) aims to maximize data utility while minimizing the disclosure risk, i.e., the risk of individual identification. A key SDC method is data perturbation, with General Additive Data Perturbation (GADP) and Copula General Additive Data Perturbation (CGADP) being two prominent approaches. Both leverage multivariate normal distributions to generate synthetic data while preserving statistical properties of the original dataset. Given the increasing use of machine learning for data modeling, this study compares the performance of various machine learning models on GADP- and CGADP-perturbed data. Using Monte Carlo simulations with three data-generating models and a real dataset, we evaluate the predictive performance and robustness of ten machine learning techniques under data perturbation. Our findings provide insights into the machine learning techniques that perform robustly on GADP- and CGADP-perturbed datasets, extending previous research that primarily focused on simple statistics such as means, variances, and correlations.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>data confidentiality</kwd>
<kwd>data perturbation</kwd>
<kwd>machine learning</kwd>
<kwd>predictive modeling</kwd>
<kwd>statistical disclosure control</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1186_reflist_001">
<title>References</title>
<ref id="j_jds1186_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Blanco-Justicia</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sánchez</surname> <given-names>D</given-names></string-name>, <string-name><surname>Domingo-Ferrer</surname> <given-names>J</given-names></string-name>, <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name> (<year>2022</year>). <article-title>A critical review on the use (and misuse) of differential privacy in machine learning</article-title>. <source><italic>ACM Computing Surveys</italic></source>, <volume>55</volume>(<issue>8</issue>): <fpage>1</fpage>–<lpage>16</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>2001</year>). <article-title>Random forests</article-title>. <source><italic>Machine Learning</italic></source>, <volume>45</volume>(<issue>1</issue>): <fpage>5</fpage>–<lpage>32</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1023/A:1010933404324" xlink:type="simple">https://doi.org/10.1023/A:1010933404324</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Carlson</surname> <given-names>M</given-names></string-name>, <string-name><surname>Salabasis</surname> <given-names>M</given-names></string-name> (<year>2002</year>). <article-title>A data-swapping technique using ranks—a method for disclosure control</article-title>. <source><italic>Research in Official Statistics</italic></source>, <volume>6</volume>(<issue>2</issue>): <fpage>35</fpage>–<lpage>64</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_004">
<mixed-citation publication-type="chapter"> <string-name><surname>Chen</surname> <given-names>T</given-names></string-name>, <string-name><surname>Guestrin</surname> <given-names>C</given-names></string-name> (<year>2016</year>). <chapter-title>XGBoost: A scalable tree boosting system</chapter-title>. In: <source><italic>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</italic></source>, <fpage>785</fpage>–<lpage>794</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/2939672.2939785" xlink:type="simple">https://doi.org/10.1145/2939672.2939785</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Chu</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Ip</surname> <given-names>CY</given-names></string-name>, <string-name><surname>Lam</surname> <given-names>BS</given-names></string-name>, <string-name><surname>So</surname> <given-names>MK</given-names></string-name> (<year>2022</year>). <article-title>Statistical disclosure control for continuous variables using an extended skew-t copula</article-title>. <source><italic>Applied Stochastic Models in Business and Industry</italic></source>, <volume>38</volume>(<issue>1</issue>): <fpage>96</fpage>–<lpage>115</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/asmb.2650" xlink:type="simple">https://doi.org/10.1002/asmb.2650</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Chu</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Lam</surname> <given-names>BS</given-names></string-name>, <string-name><surname>Tiwari</surname> <given-names>A</given-names></string-name>, <string-name><surname>So</surname> <given-names>MK</given-names></string-name> (<year>2019</year>). <article-title>An empirical study of applying statistical disclosure control methods to public health research</article-title>. <source><italic>International Journal of Environmental Research and Public Health</italic></source>, <volume>16</volume>(<issue>22</issue>): <fpage>4519</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.3390/ijerph16224519" xlink:type="simple">https://doi.org/10.3390/ijerph16224519</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Duroux</surname> <given-names>R</given-names></string-name>, <string-name><surname>Scornet</surname> <given-names>E</given-names></string-name> (<year>2018</year>). <article-title>Impact of subsampling and tree depth on random forests</article-title>. <source><italic>ESAIM: Probability and Statistics</italic></source>, <volume>22</volume>: <fpage>96</fpage>–<lpage>128</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1051/ps/2018008" xlink:type="simple">https://doi.org/10.1051/ps/2018008</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_008">
<mixed-citation publication-type="other"> <string-name><surname>Elliot</surname> <given-names>M</given-names></string-name>, <string-name><surname>Domingo-Ferrer</surname> <given-names>J</given-names></string-name> (<year>2018</year>). The future of statistical disclosure control. <italic>arXiv preprint arXiv:</italic><ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1812.09204"><italic>1812.09204</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Estes</surname> <given-names>JP</given-names></string-name>, <string-name><surname>Mukherjee</surname> <given-names>B</given-names></string-name>, <string-name><surname>Taylor</surname> <given-names>JM</given-names></string-name> (<year>2018</year>). <article-title>Empirical Bayes estimation and prediction using summary-level information from external big data sources adjusting for violations of transportability</article-title>. <source><italic>Statistics in Biosciences</italic></source>, <volume>10</volume>: <fpage>568</fpage>–<lpage>586</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s12561-018-9217-4" xlink:type="simple">https://doi.org/10.1007/s12561-018-9217-4</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Gu</surname> <given-names>T</given-names></string-name>, <string-name><surname>Taylor</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>W</given-names></string-name>, <string-name><surname>Mukherjee</surname> <given-names>B</given-names></string-name> (<year>2019</year>). <article-title>Synthetic data method to incorporate external information into a current study</article-title>. <source><italic>Canadian Journal of Statistics</italic></source>, <volume>47</volume>(<issue>4</issue>): <fpage>580</fpage>–<lpage>603</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/cjs.11513" xlink:type="simple">https://doi.org/10.1002/cjs.11513</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_011">
<mixed-citation publication-type="book"> <string-name><surname>Hastie</surname> <given-names>T</given-names></string-name>, <string-name><surname>Tibshirani</surname> <given-names>R</given-names></string-name>, <string-name><surname>Friedman</surname> <given-names>JH</given-names></string-name> (<year>2009</year>). <source><italic>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</italic></source>, volume <volume>2</volume>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Hoshino</surname> <given-names>N</given-names></string-name> (<year>2020</year>). <article-title>A firm foundation for statistical disclosure control</article-title>. <source><italic>Japanese Journal of Statistics and Data Science</italic></source>, <volume>3</volume>: <fpage>721</fpage>–<lpage>746</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s42081-020-00086-9" xlink:type="simple">https://doi.org/10.1007/s42081-020-00086-9</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Drechsler</surname> <given-names>J</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>HJ</given-names></string-name> (<year>2022</year>a). <article-title>Accuracy gains from privacy amplification through sampling for differential privacy</article-title>. <source><italic>Journal of Survey Statistics and Methodology</italic></source>, <volume>10</volume>(<issue>3</issue>): <fpage>688</fpage>–<lpage>719</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/jssam/smac012" xlink:type="simple">https://doi.org/10.1093/jssam/smac012</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Savitsky</surname> <given-names>TD</given-names></string-name>, <string-name><surname>Williams</surname> <given-names>MR</given-names></string-name> (<year>2022</year>b). <article-title>Private tabular survey data products through synthetic microdata generation</article-title>. <source><italic>Journal of Survey Statistics and Methodology</italic></source>, <volume>10</volume>(<issue>3</issue>): <fpage>720</fpage>–<lpage>752</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/jssam/smac001" xlink:type="simple">https://doi.org/10.1093/jssam/smac001</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Kokosi</surname> <given-names>T</given-names></string-name>, <string-name><surname>De Stavola</surname> <given-names>B</given-names></string-name>, <string-name><surname>Mitra</surname> <given-names>R</given-names></string-name>, <string-name><surname>Frayling</surname> <given-names>L</given-names></string-name>, <string-name><surname>Doherty</surname> <given-names>A</given-names></string-name>, <string-name><surname>Dove</surname> <given-names>I</given-names></string-name>, et al. (<year>2022</year>). <article-title>An overview of synthetic administrative data for research</article-title>. <source><italic>International Journal of Population Data Science</italic></source>, <volume>7</volume>(<issue>1</issue>). <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.23889/ijpds.v7i1.1727" xlink:type="simple">https://doi.org/10.23889/ijpds.v7i1.1727</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Li</surname> <given-names>B</given-names></string-name>, <string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>Z</given-names></string-name> (<year>2006</year>). <article-title>Novel algorithm for constructing support vector machine regression ensemble</article-title>. <source><italic>Journal of Systems Engineering and Electronics</italic></source>, <volume>17</volume>(<issue>3</issue>): <fpage>541</fpage>–<lpage>545</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/S1004-4132(06)60093-5" xlink:type="simple">https://doi.org/10.1016/S1004-4132(06)60093-5</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_017">
<mixed-citation publication-type="other"> <string-name><surname>Lundell</surname> <given-names>JF</given-names></string-name> (<year>2023</year>). Tuning support vector machines and boosted trees using optimization algorithms. <italic>arXiv preprint arXiv:</italic><ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2303.07400"><italic>2303.07400</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_018">
<mixed-citation publication-type="other"> <string-name><surname>McConville</surname> <given-names>K</given-names></string-name> (<year>2011</year>). Improved estimation for complex surveys using modern regression techniques. Ph.D. thesis, Colorado State University.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_019">
<mixed-citation publication-type="other"> <string-name><surname>Meyer</surname> <given-names>D</given-names></string-name>, <string-name><surname>Dimitriadou</surname> <given-names>E</given-names></string-name>, <string-name><surname>Hornik</surname> <given-names>K</given-names></string-name>, <string-name><surname>Weingessel</surname> <given-names>A</given-names></string-name>, <string-name><surname>Leisch</surname> <given-names>F</given-names></string-name> (<year>2023</year>). <italic>e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien</italic>. R package version 1.7-14.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Moore</surname> <given-names>R</given-names></string-name> (<year>1996</year>). <article-title>Controlled data swapping for masking public use microdata sets</article-title>. <source><italic>US Census Bureau Research Report</italic></source>, <volume>96</volume>(<issue>04</issue>).</mixed-citation>
</ref>
<ref id="j_jds1186_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Parsa</surname> <given-names>R</given-names></string-name>, <string-name><surname>Sarathy</surname> <given-names>R</given-names></string-name> (<year>1999</year>). <article-title>A general additive data perturbation method for database security</article-title>. <source><italic>Management Science</italic></source>, <volume>45</volume>(<issue>10</issue>): <fpage>1399</fpage>–<lpage>1415</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1287/mnsc.45.10.1399" xlink:type="simple">https://doi.org/10.1287/mnsc.45.10.1399</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sarathy</surname> <given-names>R</given-names></string-name> (<year>2003</year>). <article-title>A theoretical basis for perturbation methods</article-title>. <source><italic>Statistics and Computing</italic></source>, <volume>13</volume>: <fpage>329</fpage>–<lpage>335</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1023/A:1025610705286" xlink:type="simple">https://doi.org/10.1023/A:1025610705286</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sarathy</surname> <given-names>R</given-names></string-name> (<year>2005</year>). <article-title>An enhanced data perturbation approach for small data sets</article-title>. <source><italic>Decision Sciences</italic></source>, <volume>36</volume>(<issue>3</issue>): <fpage>513</fpage>–<lpage>529</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.1540-5414.2005.00082.x" xlink:type="simple">https://doi.org/10.1111/j.1540-5414.2005.00082.x</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sarathy</surname> <given-names>R</given-names></string-name> (<year>2006</year>). <article-title>Data shuffling—a new masking approach for numerical data</article-title>. <source><italic>Management Science</italic></source>, <volume>52</volume>(<issue>5</issue>): <fpage>658</fpage>–<lpage>670</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1287/mnsc.1050.0503" xlink:type="simple">https://doi.org/10.1287/mnsc.1050.0503</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_025">
<mixed-citation publication-type="book"> <collab>R Core Team</collab> (<year>2022</year>). <source><italic>R: A Language and Environment for Statistical Computing</italic></source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Sarathy</surname> <given-names>R</given-names></string-name>, <string-name><surname>Muralidhar</surname> <given-names>K</given-names></string-name>, <string-name><surname>Parsa</surname> <given-names>R</given-names></string-name> (<year>2002</year>). <article-title>Perturbing nonnormal confidential attributes: The copula approach</article-title>. <source><italic>Management Science</italic></source>, <volume>48</volume>(<issue>12</issue>): <fpage>1613</fpage>–<lpage>1627</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1287/mnsc.48.12.1613.439" xlink:type="simple">https://doi.org/10.1287/mnsc.48.12.1613.439</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_027">
<mixed-citation publication-type="other"> <string-name><surname>Shen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>R</given-names></string-name> (<year>2023</year>). Boosting data analytics with synthetic volume expansion. <italic>arXiv preprint arXiv:</italic><ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2310.17848"><italic>2310.17848</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_028">
<mixed-citation publication-type="journal"> <string-name><surname>Tay</surname> <given-names>JK</given-names></string-name>, <string-name><surname>Narasimhan</surname> <given-names>B</given-names></string-name>, <string-name><surname>Hastie</surname> <given-names>T</given-names></string-name> (<year>2023</year>). <article-title>Elastic net regularization paths for all generalized linear models</article-title>. <source><italic>Journal of Statistical Software</italic></source>, <volume>106</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>31</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.18637/jss.v106.i01" xlink:type="simple">https://doi.org/10.18637/jss.v106.i01</ext-link></mixed-citation>
</ref>
<ref id="j_jds1186_ref_029">
<mixed-citation publication-type="other"> <string-name><surname>Toth</surname> <given-names>D</given-names></string-name> (<year>2021</year>). <italic>rpms: Recursive Partitioning for Modeling Survey Data</italic>. R package version 0.5.1.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_030">
<mixed-citation publication-type="book"> <string-name><surname>Venables</surname> <given-names>WN</given-names></string-name>, <string-name><surname>Ripley</surname> <given-names>BD</given-names></string-name> (<year>2002</year>). <source><italic>Modern Applied Statistics with S</italic></source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York</publisher-loc>. ISBN <isbn>0-387-95457-0</isbn>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_031">
<mixed-citation publication-type="chapter"> <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>D</given-names></string-name> (<year>2016</year>). <chapter-title>Using randomized response for differential privacy preserving data collection</chapter-title>. In: <string-name><given-names>Themis</given-names> <surname>Palpanas</surname></string-name> and <string-name><given-names>Kostas</given-names> <surname>Stefanidis</surname></string-name> (Eds.), <source><italic>Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference</italic></source>, volume <volume>1558</volume>, <fpage>0090</fpage>–<lpage>6778</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_032">
<mixed-citation publication-type="book"> <string-name><surname>Willenborg</surname> <given-names>L</given-names></string-name>, <string-name><surname>de Waal</surname> <given-names>T</given-names></string-name> (<year>2001</year>). <source><italic>Elements of Statistical Disclosure Control</italic></source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1186_ref_033">
<mixed-citation publication-type="journal"> <string-name><surname>Wright</surname> <given-names>MN</given-names></string-name>, <string-name><surname>Ziegler</surname> <given-names>A</given-names></string-name> (<year>2017</year>). <article-title>ranger: A fast implementation of random forests for high dimensional data in C++ and R</article-title>. <source><italic>Journal of Statistical Software</italic></source>, <volume>77</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>17</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.18637/jss.v077.i01" xlink:type="simple">https://doi.org/10.18637/jss.v077.i01</ext-link></mixed-citation>
</ref>
</ref-list>
</back>
</article>
