<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn>
<issn pub-type="ppub">1680-743X</issn>
<issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1008</article-id>
<article-id pub-id-type="doi">10.6339/21-JDS1008</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Spline Pattern-Mixture Models for Missing Data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Yang</surname><given-names>Ye</given-names></name><email xlink:href="mailto:yeya@umich.edu">yeya@umich.edu</email><xref ref-type="aff" rid="j_jds1008_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Little</surname><given-names>Roderick J.A.</given-names></name><xref ref-type="aff" rid="j_jds1008_aff_002">2</xref>
</contrib>
<aff id="j_jds1008_aff_001"><label>1</label><institution>Center for Biologics Evaluation and Research, FDA</institution>, Silver Spring, MD, <country>USA</country></aff>
<aff id="j_jds1008_aff_002"><label>2</label><institution>Department of Biostatistics, University of Michigan</institution>, Ann Arbor, MI, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:yeya@umich.edu">yeya@umich.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2021</year></pub-date><pub-date pub-type="epub"><day>10</day><month>2</month><year>2021</year></pub-date>
<volume>19</volume><issue>1</issue><fpage>75</fpage><lpage>95</lpage>
<supplementary-material id="S1" content-type="archive" xlink:href="jds1008_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>Please refer to the Supplementary Material document for:</p><p>1. A detailed description of the Gibbs sampling algorithm for the penalized spline prediction.</p><p>2. Results from all six simulation scenarios, including estimates from <inline-formula id="j_jds1008_ineq_001"><alternatives>
<mml:math><mml:mi mathvariant="italic">n</mml:mi><mml:mo>=</mml:mo><mml:mn>100</mml:mn></mml:math>
<tex-math><![CDATA[$n=100$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_jds1008_ineq_002"><alternatives>
<mml:math><mml:mi mathvariant="italic">n</mml:mi><mml:mo>=</mml:mo><mml:mn>400</mml:mn></mml:math>
<tex-math><![CDATA[$n=400$]]></tex-math></alternatives></inline-formula>, under <inline-formula id="j_jds1008_ineq_003"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">A</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\lambda _{A}}={\lambda _{T}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_jds1008_ineq_004"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">A</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">≠</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\lambda _{A}}\ne {\lambda _{T}}$]]></tex-math></alternatives></inline-formula>.</p><p>3. R code and workspace for the simulations.</p>
</caption>
</supplementary-material>
<history>
<date date-type="received"><month>11</month><year>2020</year></date>
<date date-type="accepted"><month>1</month><year>2021</year></date>
</history>
<permissions><copyright-statement>2021 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>We consider a continuous outcome subject to nonresponse and a fully observed covariate. We propose a spline proxy pattern-mixture model (S-PPMA), an extension of the proxy pattern-mixture model (PPMA) (Andridge and Little, <xref ref-type="bibr" rid="j_jds1008_ref_002">2011</xref>), to estimate the mean of the outcome under varying assumptions about nonresponse. PPMA assumes that the outcome and the covariate are bivariate normal; S-PPMA improves its robustness by modeling the relationship between them with a spline. Simulations indicate that S-PPMA outperforms PPMA when the data deviate from normality and are missing not at random, with only minor losses of efficiency when the data are normal.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>missing data</kwd>
<kwd>missing not at random</kwd>
<kwd>nonignorable nonresponse</kwd>
<kwd>nonresponse bias</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1008_reflist_001">
<title>References</title>
<ref id="j_jds1008_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Andridge</surname> <given-names>RR</given-names></string-name>, <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>2010</year>). <article-title>A review of hot deck imputation for survey nonresponse</article-title>. <source>International Statistical Review</source>, <volume>78</volume>(<issue>1</issue>): <fpage>40</fpage>–<lpage>64</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Andridge</surname> <given-names>RR</given-names></string-name>, <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>2011</year>). <article-title>Proxy pattern-mixture analysis for survey nonresponse</article-title>. <source>Journal of Official Statistics</source>, <volume>27</volume>: <fpage>153</fpage>–<lpage>180</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>1993</year>). <article-title>Pattern-mixture models for multivariate incomplete data</article-title>. <source>Journal of the American Statistical Association</source>, <volume>88</volume>: <fpage>125</fpage>–<lpage>134</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>1994</year>). <article-title>A class of pattern-mixture models for normal incomplete data</article-title>. <source>Biometrika</source>, <volume>81</volume>: <fpage>471</fpage>–<lpage>483</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name>, <string-name><surname>An</surname> <given-names>H</given-names></string-name> (<year>2004</year>). <article-title>Robust likelihood-based analysis of multivariate data with missing values</article-title>. <source>Statistica Sinica</source>, <volume>14</volume>: <fpage>949</fpage>–<lpage>968</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_006">
<mixed-citation publication-type="book"> <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>2020</year>). <source>Statistical Analysis with Missing Data</source>. <publisher-name>Wiley</publisher-name>, <edition>Third</edition> Edition.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Pfeffermann</surname> <given-names>D</given-names></string-name>, <string-name><surname>Sikov</surname> <given-names>A</given-names></string-name> (<year>2011</year>). <article-title>Imputation and estimation under nonignorable nonresponse in household surveys with missing covariate information</article-title>. <source>Journal of Official Statistics</source>, <volume>27</volume>: <fpage>181</fpage>–<lpage>209</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1976</year>). <article-title>Inference and missing data</article-title>. <source>Biometrika</source>, <volume>63</volume>: <fpage>581</fpage>–<lpage>592</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Schouten</surname> <given-names>B</given-names></string-name> (<year>2007</year>). <article-title>A selection strategy for weighting variables under a not-missing-at-random assumption</article-title>. <source>Journal of Official Statistics</source>, <volume>23</volume>: <fpage>51</fpage>–<lpage>68</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Yang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>2015</year>). <article-title>A comparison of doubly robust estimators of the mean with missing data</article-title>. <source>Journal of Statistical Computation and Simulation</source>, <volume>85</volume>: <fpage>3383</fpage>–<lpage>3403</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1008_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Little</surname> <given-names>RJA</given-names></string-name> (<year>2009</year>). <article-title>Extensions of the penalized spline of propensity prediction method of imputation</article-title>. <source>Biometrics</source>, <volume>65</volume>(<issue>3</issue>): <fpage>911</fpage>–<lpage>918</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
