<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1183</article-id>
<article-id pub-id-type="doi">10.6339/25-JDS1183</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Computing in Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>The Journey to Improve LCPM: An R Package for Ordinal Regression</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>DePratti</surname><given-names>Roland</given-names></name><email xlink:href="mailto:roland.depratti@my.ccsu.edu">roland.depratti@my.ccsu.edu</email><xref ref-type="aff" rid="j_jds1183_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Singh</surname><given-names>Gurbakhshash</given-names></name><xref ref-type="aff" rid="j_jds1183_aff_001">1</xref>
</contrib>
<aff id="j_jds1183_aff_001"><label>1</label><institution>Central Connecticut State University</institution>, New Britain, Connecticut, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:roland.depratti@my.ccsu.edu">roland.depratti@my.ccsu.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2025</year></pub-date><pub-date pub-type="epub"><day>5</day><month>5</month><year>2025</year></pub-date><volume>23</volume><issue>2</issue><fpage>399</fpage><lpage>415</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1183_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The supplementary zip file contains all source code (both R and C++), an R package used to run test case D, a sample Windows batch script to run the code, and three sample CSV files that serve as input to the batch script. A readme file is also included with further explanation of how to use these files.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>3</day><month>10</month><year>2024</year></date><date date-type="accepted"><day>4</day><month>4</month><year>2025</year></date></history>
<permissions><copyright-statement>2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2025</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Recently, the log cumulative probability model (LCPM) and its special case, the proportional probability model (PPM), were developed to relate ordinal outcomes to predictor variables using the log link instead of the logit link. These models permit the estimation of probabilities instead of odds, but the log link requires constrained maximum likelihood estimation (cMLE). An algorithm that efficiently handles cMLE for the LCPM is a valuable resource, as these models are applicable in many settings and their output is easy to interpret. One such implementation is in the R package <monospace>lcpm</monospace>. In this era of big data, all statistical models are under pressure to meet new processing demands. This work aimed to improve the algorithm in the R package <monospace>lcpm</monospace> so that it processes more input in less time using less memory.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>constrained maximum likelihood estimation</kwd>
<kwd>log link</kwd>
<kwd>ordinal outcomes</kwd>
<kwd>proportional probability model</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1183_reflist_001">
<title>References</title>
<ref id="j_jds1183_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Albert</surname> <given-names>A</given-names></string-name>, <string-name><surname>Anderson</surname> <given-names>JA</given-names></string-name> (<year>1984</year>). <article-title>On the existence of maximum likelihood estimates in logistic regression models</article-title>. <source><italic>Biometrika</italic></source>, <volume>71</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>10</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biomet/71.1.1" xlink:type="simple">https://doi.org/10.1093/biomet/71.1.1</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_002">
<mixed-citation publication-type="other"> Amazon (<year>2025</year>). AWS ParallelCluster. <uri>https://docs.aws.amazon.com/parallelcluster/latest/ug/cloudformation-v3.html</uri>. Accessed: 2025-02-13.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_003">
<mixed-citation publication-type="other"> <string-name><surname>Andrade</surname> <given-names>B</given-names></string-name> (<year>2019</year>). lbreg: Log-binomial regression with constrained optimization. <uri>https://CRAN.R-project.org/package=lbreg</uri>. R package version 1.3.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Blizzard</surname> <given-names>CL</given-names></string-name>, <string-name><surname>Quinn</surname> <given-names>SJ</given-names></string-name>, <string-name><surname>Canary</surname> <given-names>JD</given-names></string-name>, <string-name><surname>Hosmer</surname> <given-names>DW</given-names></string-name> (<year>2013</year>). <article-title>Log-link regression models for ordinal responses</article-title>. <source><italic>Open Journal of Statistics</italic></source>, <volume>3</volume>: <fpage>16</fpage>–<lpage>25</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.4236/ojs.2013.34A003" xlink:type="simple">https://doi.org/10.4236/ojs.2013.34A003</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_005">
<mixed-citation publication-type="other"> <string-name><surname>Clore</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cios</surname> <given-names>K</given-names></string-name>, <string-name><surname>DeShazo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Strack</surname> <given-names>B</given-names></string-name> (<year>2014</year>). Diabetes 130-US hospitals for years 1999–2008. UCI Machine Learning Repository. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.24432/C5230J" xlink:type="simple">https://doi.org/10.24432/C5230J</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_006">
<mixed-citation publication-type="other"> <string-name><surname>Halevy</surname> <given-names>A</given-names></string-name>, <string-name><surname>Norvig</surname> <given-names>P</given-names></string-name>, <string-name><surname>Pereira</surname> <given-names>F</given-names></string-name> (<year>2009</year>). The unreasonable effectiveness of data. Google Research. <uri>https://static.googleusercontent.com/media/research.google.com/en//archive/people/peter/papers/UnreasonableEffectivenessOfData.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Lange</surname> <given-names>K</given-names></string-name> (<year>1994</year>). <article-title>An adaptive barrier method for convex programming</article-title>. <source><italic>Methods and Applications of Analysis</italic></source>, <volume>1</volume>(<issue>4</issue>): <fpage>392</fpage>–<lpage>402</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.4310/MAA.1994.v1.n4.a1" xlink:type="simple">https://doi.org/10.4310/MAA.1994.v1.n4.a1</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Luo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>H</given-names></string-name> (<year>2014</year>). <article-title>Estimation of relative risk using a log-binomial model with constraints</article-title>. <source><italic>Computational Statistics</italic></source>, <volume>29</volume>: <fpage>981</fpage>–<lpage>1003</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s00180-013-0476-8" xlink:type="simple">https://doi.org/10.1007/s00180-013-0476-8</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>McCullagh</surname> <given-names>P</given-names></string-name> (<year>1980</year>). <article-title>Regression models for ordinal data</article-title>. <source><italic>Journal of the Royal Statistical Society, Series B (Methodological)</italic></source>, <volume>42</volume>(<issue>2</issue>): <fpage>109</fpage>–<lpage>142</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.2517-6161.1980.tb01109.x" xlink:type="simple">https://doi.org/10.1111/j.2517-6161.1980.tb01109.x</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_010">
<mixed-citation publication-type="book"> <collab>R Core Team</collab> (<year>2021</year>). <source><italic>R: A Language and Environment for Statistical Computing</italic></source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>. <uri>https://www.R-project.org/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_011">
<mixed-citation publication-type="other"> <string-name><surname>Rajaraman</surname> <given-names>A</given-names></string-name> (<year>2008</year>). More data than usual. <uri>https://anand.typepad.com/datawocky/2008/03/more-data-usual.html</uri>. Accessed: 2025-02-14.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_012">
<mixed-citation publication-type="other"> <string-name><surname>Schnoebelen</surname> <given-names>T</given-names></string-name> (<year>2016</year>). More data beats better algorithms. <uri>https://www.datasciencecentral.com/more-data-beats-better-algorithms-by-tyler-schnoebelen/</uri>. Accessed: 2025-02-08.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Schwendinger</surname> <given-names>F</given-names></string-name>, <string-name><surname>Grun</surname> <given-names>B</given-names></string-name>, <string-name><surname>Hornik</surname> <given-names>K</given-names></string-name> (<year>2021</year>). <article-title>A comparison of optimization solvers for log binomial regression including conic programming</article-title>. <source><italic>Computational Statistics</italic></source>, <volume>36</volume>: <fpage>1721</fpage>–<lpage>1754</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s00180-021-01084-5" xlink:type="simple">https://doi.org/10.1007/s00180-021-01084-5</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Singh</surname> <given-names>G</given-names></string-name>, <string-name><surname>Fick</surname> <given-names>GH</given-names></string-name> (<year>2020</year>a). <article-title>Ordinal outcomes: A cumulative probability model with the log link and an assumption of proportionality</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>39</volume>: <fpage>1343</fpage>–<lpage>1361</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.8479" xlink:type="simple">https://doi.org/10.1002/sim.8479</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_015">
<mixed-citation publication-type="other"> <string-name><surname>Singh</surname> <given-names>G</given-names></string-name>, <string-name><surname>Fick</surname> <given-names>GH</given-names></string-name> (<year>2020</year>b). LCPM: Ordinal outcomes: Generalized linear models with the log link. <uri>https://CRAN.R-project.org/package=lcpm</uri>. R package version 0.1.1.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_016">
<mixed-citation publication-type="other"> <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name> (<year>2023</year>). alabama: Constrained nonlinear optimization. <uri>https://CRAN.R-project.org/package=alabama</uri>. R package version 2023.1.0.</mixed-citation>
</ref>
<ref id="j_jds1183_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Williams</surname> <given-names>R</given-names></string-name> (<year>2010</year>). <article-title>Fitting heterogeneous choice models with oglm</article-title>. <source><italic>Stata Journal</italic></source>, <volume>10</volume>(<issue>4</issue>): <fpage>540</fpage>–<lpage>567</lpage>. <uri>http://www.stata-journal.com/article.html?article=st0208.4</uri>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/1536867X1101000402" xlink:type="simple">https://doi.org/10.1177/1536867X1101000402</ext-link></mixed-citation>
</ref>
<ref id="j_jds1183_ref_018">
<mixed-citation publication-type="other"> <string-name><surname>Yee</surname> <given-names>TW</given-names></string-name> (<year>2024</year>). VGAM: Vector generalized linear and additive models. <uri>https://CRAN.R-project.org/package=VGAM</uri>. R package version 1.1-11.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
