Journal of Data Science

The Journey to Improve LCPM: An R Package for Ordinal Regression
Volume 23, Issue 2 (2025): Special Issue: the 2024 Symposium on Data Science and Statistics (SDSS), pp. 399–415
Roland DePratti, Gurbakhshash Singh

https://doi.org/10.6339/25-JDS1183
Pub. online: 5 May 2025      Type: Computing In Data Science      Open Access

Received: 3 October 2024
Accepted: 4 April 2025
Published: 5 May 2025

Abstract

Recently, the log cumulative probability model (LCPM) and its special case, the proportional probability model (PPM), were developed to relate ordinal outcomes to predictor variables using the log link instead of the logit link. These models permit the estimation of probabilities instead of odds, but the log link requires constrained maximum likelihood estimation (cMLE). An algorithm that efficiently handles cMLE for the LCPM is a valuable resource because these models are applicable in many settings and their output is easy to interpret. One such implementation is the R package lcpm. In this era of big data, statistical models are under pressure to meet new processing demands. This work aimed to improve the algorithm in the R package lcpm so that it processes more input in less time using less memory.
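
For orientation, the models named in the abstract can be written out as below. This is only a sketch in one common parameterization and may not match the notation or sign convention of the paper: the symbols $Y$ (ordinal outcome with categories $1, \dots, J$), $\mathbf{x}_i$ (predictor vector for observation $i$), $\alpha_j$ (intercepts) and $\boldsymbol{\beta}_j$ (slopes) are assumptions introduced here for illustration.

```latex
% LCPM: the log link is applied to the cumulative probabilities
\begin{align}
  \log \Pr(Y \le j \mid \mathbf{x}_i)
    &= \alpha_j + \mathbf{x}_i^{\top} \boldsymbol{\beta}_j,
    & j &= 1, \dots, J-1. \\
% PPM: special case in which all cumulative splits share one slope vector
  \boldsymbol{\beta}_1 = \dots = \boldsymbol{\beta}_{J-1}
    &= \boldsymbol{\beta}. \\
% Why the MLE is constrained: the fitted values must be valid,
% non-decreasing cumulative probabilities for every observation i
  \alpha_{j-1} + \mathbf{x}_i^{\top} \boldsymbol{\beta}_{j-1}
    &\le \alpha_j + \mathbf{x}_i^{\top} \boldsymbol{\beta}_j \le 0,
    & j &= 2, \dots, J-1.
\end{align}
```

Under this parameterization, the exponential of a slope coefficient is a ratio of cumulative probabilities per unit increase in the predictor, whereas the logit link yields an odds ratio. The inequalities in the last line are what make the likelihood maximization a constrained problem.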

Supplementary material

The supplementary zip file contains all source code (both R and C++), an R package used to run test case D, a sample Windows batch script to run the code, and three sample CSV files that serve as input to the batch script. A readme file is also included with further explanation of how to use these files.


Copyright
2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
constrained maximum likelihood estimation; log link; ordinal outcomes; proportional probability model
