<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1188</article-id>
<article-id pub-id-type="doi">10.6339/25-JDS1188</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Exploring Massive Risk Factors of Categorical Outcomes via Supervised Dimension Reduction</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Li</surname><given-names>Yan</given-names></name><xref ref-type="aff" rid="j_jds1188_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Alemdjrodo</surname><given-names>Kangni</given-names></name><xref ref-type="aff" rid="j_jds1188_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Lin</surname><given-names>Yanzhu</given-names></name><xref ref-type="aff" rid="j_jds1188_aff_003">3</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname><given-names>Min</given-names></name><xref ref-type="aff" rid="j_jds1188_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname><given-names>Dabao</given-names></name><email xlink:href="mailto:dabao.zhang@uci.edu">dabao.zhang@uci.edu</email><xref ref-type="aff" rid="j_jds1188_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1188_aff_001"><label>1</label>Department of Epidemiology and Biostatistics, <institution>University of California</institution>, Irvine, CA 92617, <country>United States</country></aff>
<aff id="j_jds1188_aff_002"><label>2</label>Department of Mathematics and Statistics, <institution>Georgia State University</institution>, Atlanta, GA 30303, <country>United States</country></aff>
<aff id="j_jds1188_aff_003"><label>3</label><institution>Eli Lilly and Company</institution>, Indianapolis, IN 46285, <country>United States</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:dabao.zhang@uci.edu">dabao.zhang@uci.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2025</year></pub-date><pub-date pub-type="epub"><day>27</day><month>5</month><year>2025</year></pub-date><volume>23</volume><issue>4</issue><fpage>607</fpage><lpage>623</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1188_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The MATLAB code for gPOCRE is available on the journal’s website. The ISOLET data by <xref ref-type="bibr" rid="j_jds1188_ref_007">Fanty and Cole</xref> (<xref ref-type="bibr" rid="j_jds1188_ref_007">1990</xref>) can be downloaded from <uri>https://www.openml.org/search?type=data&amp;sort=version&amp;status=any&amp;order=asc&amp;exact_name=isolet&amp;id=41966</uri>, and the breast cancer data can be found in the R package mixOmics (<uri>https://mixomics.org/</uri>).</p>
</caption>
</supplementary-material><history><date date-type="received"><day>29</day><month>9</month><year>2024</year></date><date date-type="accepted"><day>2</day><month>5</month><year>2025</year></date></history>
<permissions><copyright-statement>2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2025</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>We propose to explore high-dimensional data with categorical outcomes by generalizing the penalized orthogonal-components regression method (POCRE), a supervised dimension reduction method initially proposed for high-dimensional linear regression. This generalized POCRE, i.e., gPOCRE, sequentially builds up orthogonal components by selecting predictors that maximally explain the variation of the response variables. Therefore, gPOCRE simultaneously selects significant predictors and reduces dimensions by constructing linear components of these selected predictors for a high-dimensional generalized linear model. For multiple categorical outcomes, gPOCRE can also construct common components shared by all outcomes to improve the power of selecting variables shared by multiple outcomes. Both simulation studies and real data analyses are carried out to illustrate the performance of gPOCRE.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>gPOCRE</kwd>
<kwd>latent model</kwd>
<kwd>logistic regression</kwd>
<kwd>multinomial regression</kwd>
<kwd>orthogonal components</kwd>
</kwd-group>
<funding-group><funding-statement>This research was partially supported by NSF CAREER award IIS-0844945, NIH grants R01GM131491, R01GM131491-02S1, R01GM131491-02S2, R01AG080917, and R01AG080917-02S1, NCI grant P30CA062203, and UCI Anti-Cancer Challenge funds from the UC Irvine Comprehensive Cancer Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the Chao Family Comprehensive Cancer Center.</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1188_reflist_001">
<title>References</title>
<ref id="j_jds1188_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Boulesteix</surname> <given-names>AL</given-names></string-name>, <string-name><surname>Strimmer</surname> <given-names>K</given-names></string-name> (<year>2006</year>). <article-title>Partial least squares: A versatile tool for the analysis of high-dimensional genomic data</article-title>. <source><italic>Briefings in Bioinformatics</italic></source>, <volume>8</volume>: <fpage>32</fpage>–<lpage>44</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/bib/bbl016" xlink:type="simple">https://doi.org/10.1093/bib/bbl016</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Chun</surname> <given-names>H</given-names></string-name>, <string-name><surname>Keleş</surname> <given-names>S</given-names></string-name> (<year>2010</year>). <article-title>Sparse partial least squares regression for simultaneous dimension reduction and variable selection</article-title>. <source><italic>Journal of the Royal Statistical Society Series B: Statistical Methodology</italic></source>, <volume>72</volume>(<issue>1</issue>): <fpage>3</fpage>–<lpage>25</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.1467-9868.2009.00723.x" xlink:type="simple">https://doi.org/10.1111/j.1467-9868.2009.00723.x</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Chung</surname> <given-names>D</given-names></string-name>, <string-name><surname>Keleş</surname> <given-names>S</given-names></string-name> (<year>2010</year>). <article-title>Sparse partial least squares classification for high dimensional data</article-title>. <source><italic>Statistical Applications in Genetics and Molecular Biology</italic></source>, <volume>9</volume>. Article <elocation-id>17</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>De Jong</surname> <given-names>S</given-names></string-name> (<year>1993</year>). <article-title>SIMPLS: An alternative approach to partial least squares regression</article-title>. <source><italic>Chemometrics and Intelligent Laboratory Systems</italic></source>, <volume>18</volume>: <fpage>251</fpage>–<lpage>263</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/0169-7439(93)85002-X" xlink:type="simple">https://doi.org/10.1016/0169-7439(93)85002-X</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Fan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>R</given-names></string-name> (<year>2001</year>). <article-title>Variable selection via nonconcave penalized likelihood and its oracle properties</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>96</volume>: <fpage>1348</fpage>–<lpage>1360</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1198/016214501753382273" xlink:type="simple">https://doi.org/10.1198/016214501753382273</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Fan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Samworth</surname> <given-names>R</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>Y</given-names></string-name> (<year>2009</year>). <article-title>Ultrahigh dimensional feature selection: Beyond the linear model</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>10</volume>: <fpage>2013</fpage>–<lpage>2038</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Fanty</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cole</surname> <given-names>R</given-names></string-name> (<year>1990</year>). <article-title>Spoken letter recognition</article-title>. <source><italic>Proceedings of the International Conference on Neural Information Processing Systems</italic></source>, <volume>4</volume>: <fpage>220</fpage>–<lpage>226</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Fisher</surname> <given-names>RA</given-names></string-name> (<year>1936</year>). <article-title>The use of multiple measurements in taxonomic problems</article-title>. <source><italic>Annals of Eugenics</italic></source>, <volume>7</volume>(<issue>2</issue>): <fpage>179</fpage>–<lpage>188</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.1469-1809.1936.tb02137.x" xlink:type="simple">https://doi.org/10.1111/j.1469-1809.1936.tb02137.x</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Freeman</surname> <given-names>C</given-names></string-name>, <string-name><surname>Kulić</surname> <given-names>D</given-names></string-name>, <string-name><surname>Basir</surname> <given-names>O</given-names></string-name> (<year>2013</year>). <article-title>Feature-selected tree-based classification</article-title>. <source><italic>IEEE Transactions on Cybernetics</italic></source>, <volume>43</volume>(<issue>6</issue>): <fpage>1990</fpage>–<lpage>2004</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/TSMCB.2012.2237394" xlink:type="simple">https://doi.org/10.1109/TSMCB.2012.2237394</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Friedman</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hastie</surname> <given-names>T</given-names></string-name>, <string-name><surname>Tibshirani</surname> <given-names>R</given-names></string-name> (<year>2010</year>). <article-title>Regularization paths for generalized linear models via coordinate descent</article-title>. <source><italic>Journal of Statistical Software</italic></source>, <volume>33</volume>(<issue>1</issue>): <fpage>1</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.18637/jss.v033.i01" xlink:type="simple">https://doi.org/10.18637/jss.v033.i01</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Höskuldsson</surname> <given-names>A</given-names></string-name> (<year>1988</year>). <article-title>PLS regression methods</article-title>. <source><italic>Journal of Chemometrics</italic></source>, <volume>2</volume>: <fpage>211</fpage>–<lpage>228</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/cem.1180020306" xlink:type="simple">https://doi.org/10.1002/cem.1180020306</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Höskuldsson</surname> <given-names>A</given-names></string-name> (<year>1992</year>). <article-title>The h-principle in modelling with applications to chemometrics</article-title>. <source><italic>Chemometrics and Intelligent Laboratory Systems</italic></source>, <volume>14</volume>: <fpage>139</fpage>–<lpage>153</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/0169-7439(92)80099-P" xlink:type="simple">https://doi.org/10.1016/0169-7439(92)80099-P</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Hutter</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zenklusen</surname> <given-names>JC</given-names></string-name> (<year>2018</year>). <article-title>The Cancer Genome Atlas: Creating lasting value beyond its data</article-title>. <source><italic>Cell</italic></source>, <volume>173</volume>(<issue>2</issue>): <fpage>283</fpage>–<lpage>285</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.cell.2018.03.042" xlink:type="simple">https://doi.org/10.1016/j.cell.2018.03.042</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Johnstone</surname> <given-names>IM</given-names></string-name>, <string-name><surname>Silverman</surname> <given-names>BW</given-names></string-name> (<year>2004</year>). <article-title>Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>32</volume>(<issue>4</issue>): <fpage>1594</fpage>–<lpage>1649</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/009053604000000030" xlink:type="simple">https://doi.org/10.1214/009053604000000030</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Lê Cao</surname> <given-names>KA</given-names></string-name>, <string-name><surname>Rossouw</surname> <given-names>D</given-names></string-name>, <string-name><surname>Robert-Granié</surname> <given-names>C</given-names></string-name>, <string-name><surname>Besse</surname> <given-names>P</given-names></string-name> (<year>2008</year>). <article-title>A sparse PLS for variable selection when integrating omics data</article-title>. <source><italic>Statistical Applications in Genetics and Molecular Biology</italic></source>, <volume>7</volume>(<issue>1</issue>): Article <elocation-id>35</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Lin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>D</given-names></string-name> (<year>2015</year>). <article-title>Generalized orthogonal components regression for high dimensional generalized linear models</article-title>. <source><italic>Computational Statistics &amp; Data Analysis</italic></source>, <volume>88</volume>: <fpage>119</fpage>–<lpage>127</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.csda.2015.02.006" xlink:type="simple">https://doi.org/10.1016/j.csda.2015.02.006</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Loh</surname> <given-names>WY</given-names></string-name> (<year>2011</year>). <article-title>Classification and regression trees</article-title>. <source><italic>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</italic></source>, <volume>1</volume>(<issue>1</issue>): <fpage>14</fpage>–<lpage>23</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Massy</surname> <given-names>WF</given-names></string-name> (<year>1965</year>). <article-title>Principal components regression in exploratory statistical research</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>60</volume>(<issue>309</issue>): <fpage>234</fpage>–<lpage>256</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.1965.10480787" xlink:type="simple">https://doi.org/10.1080/01621459.1965.10480787</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_019">
<mixed-citation publication-type="book"> <string-name><surname>McLachlan</surname> <given-names>GJ</given-names></string-name> (<year>2005</year>). <source><italic>Discriminant Analysis and Statistical Pattern Recognition</italic></source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_020">
<mixed-citation publication-type="book"> <string-name><surname>Nguyen</surname> <given-names>DV</given-names></string-name>, <string-name><surname>Rocke</surname> <given-names>DM</given-names></string-name> (<year>2002</year>a). <source><italic>Classification of Acute Leukemia Based on DNA Microarray Gene Expressions Using Partial Least Squares</italic></source>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Nguyen</surname> <given-names>DV</given-names></string-name>, <string-name><surname>Rocke</surname> <given-names>DM</given-names></string-name> (<year>2002</year>b). <article-title>Tumor classification by partial least squares using microarray gene expression data</article-title>. <source><italic>Bioinformatics</italic></source>, <volume>18</volume>: <fpage>39</fpage>–<lpage>50</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/bioinformatics/18.1.39" xlink:type="simple">https://doi.org/10.1093/bioinformatics/18.1.39</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Shen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gao</surname> <given-names>S</given-names></string-name> (<year>2008</year>). <article-title>A solution to separation and multicollinearity in multiple logistic regression</article-title>. <source><italic>Journal of Data Science</italic></source>, <volume>6</volume>(<issue>4</issue>): <fpage>515</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.6339/JDS.2008.06(4).395" xlink:type="simple">https://doi.org/10.6339/JDS.2008.06(4).395</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Tam</surname> <given-names>V</given-names></string-name>, <string-name><surname>Patel</surname> <given-names>N</given-names></string-name>, <string-name><surname>Turcotte</surname> <given-names>M</given-names></string-name>, <string-name><surname>Bossé</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Paré</surname> <given-names>G</given-names></string-name>, <string-name><surname>Meyre</surname> <given-names>D</given-names></string-name> (<year>2019</year>). <article-title>Benefits and limitations of genome-wide association studies</article-title>. <source><italic>Nature Reviews. Genetics</italic></source>, <volume>20</volume>(<issue>8</issue>): <fpage>467</fpage>–<lpage>484</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41576-019-0127-1" xlink:type="simple">https://doi.org/10.1038/s41576-019-0127-1</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Tibshirani</surname> <given-names>R</given-names></string-name> (<year>1996</year>). <article-title>Regression shrinkage and selection via the lasso</article-title>. <source><italic>Journal of the Royal Statistical Society Series B</italic></source>, <volume>58</volume>: <fpage>267</fpage>–<lpage>288</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.2517-6161.1996.tb02080.x" xlink:type="simple">https://doi.org/10.1111/j.2517-6161.1996.tb02080.x</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Van de Geer</surname> <given-names>SA</given-names></string-name> (<year>2008</year>). <article-title>High-dimensional generalized linear models and the lasso</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>36</volume>(<issue>2</issue>): <fpage>614</fpage>–<lpage>645</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Velliangiri</surname> <given-names>S</given-names></string-name>, <string-name><surname>Alagumuthukrishnan</surname> <given-names>S</given-names></string-name>, <etal>et al.</etal> (<year>2019</year>). <article-title>A review of dimensionality reduction techniques for efficient computation</article-title>. <source><italic>Procedia Computer Science</italic></source>, <volume>165</volume>: <fpage>104</fpage>–<lpage>111</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.procs.2020.01.079" xlink:type="simple">https://doi.org/10.1016/j.procs.2020.01.079</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_027">
<mixed-citation publication-type="chapter"> <string-name><surname>Wold</surname> <given-names>H</given-names></string-name> (<year>1966</year>). <chapter-title>Estimation of principal components and related models by iterative least squares</chapter-title>. In <string-name><given-names>PR</given-names> <surname>Krishnaiah</surname></string-name> (Ed.), <source><italic>Multivariate Analysis</italic></source>, <fpage>391</fpage>–<lpage>420</lpage>. <publisher-loc>New York</publisher-loc>: <publisher-name>Academic Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_028">
<mixed-citation publication-type="journal"> <string-name><surname>Wold</surname> <given-names>H</given-names></string-name> (<year>1975</year>). <article-title>Soft modelling by latent variables: The non-linear iterative partial least squares (NIPALS) approach</article-title>. <source><italic>Journal of Applied Probability</italic></source>, <volume>12</volume>(<issue>S1</issue>): <fpage>117</fpage>–<lpage>142</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1017/S0021900200047604" xlink:type="simple">https://doi.org/10.1017/S0021900200047604</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_029">
<mixed-citation publication-type="journal"> <string-name><surname>Xie</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>X</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>N</given-names></string-name> (<year>2020</year>). <article-title>Category-adaptive variable screening for ultra-high dimensional heterogeneous categorical data</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>115</volume>(<issue>530</issue>): <fpage>747</fpage>–<lpage>760</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2019.1573734" xlink:type="simple">https://doi.org/10.1080/01621459.2019.1573734</ext-link></mixed-citation>
</ref>
<ref id="j_jds1188_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>C</given-names></string-name> (<year>2010</year>). <article-title>Nearly unbiased variable selection under minimax concave penalty</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>38</volume>: <fpage>894</fpage>–<lpage>942</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1188_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>M</given-names></string-name> (<year>2009</year>). <article-title>Penalized orthogonal-components regression for large p small n data</article-title>. <source><italic>Electronic Journal of Statistics</italic></source>, <volume>3</volume>: <fpage>781</fpage>–<lpage>796</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
