Journal of Data Science


Validation of Stepwise-Based Procedure in GAMLSS
Volume 19, Issue 1 (2021), pp. 96–110
Thiago G. Ramires, Luiz R. Nakamura, Ana J. Righetto, et al. (7 authors in total)

https://doi.org/10.6339/21-JDS1003
Pub. online: 10 February 2021
Type: Statistical Data Science

Received: 1 November 2020
Accepted: 1 January 2021
Published: 10 February 2021

Abstract

A key task in regression modelling is selecting the explanatory variables that account for the behavior of the response variable, and stepwise-based procedures are prominent among the available approaches. In this paper we perform several simulation studies to investigate whether a specific stepwise-based approach, namely Strategy A, properly selects the true explanatory variables within the generalized additive models for location, scale and shape (GAMLSS) framework, considering Gaussian, zero-inflated Poisson and Weibull distributions. Continuous (with linear and nonlinear relationships) and categorical explanatory variables are considered, and they are selected through goodness-of-fit statistics. Overall, we conclude that Strategy A performed very well.

Supplementary material

The following supplementary files are available online: (i) suppl_stepgaic.pdf, containing p-values for the selected variables in each simulated scenario; and (ii) codes_stepgaic.zip, containing all R code used to conduct the simulation studies presented in this paper.



Copyright
© 2021 The Author(s).
This is a free to read article.

Keywords
backward, forward, model selection, smoothing


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
