Journal of Data Science logo


Login Register

  1. Home
  2. Issues
  3. Volume 12, Issue 1 (2014)
  4. A New Variable Selection Approach Inspir ...

Journal of Data Science

Submit your article Information
  • Article info
  • Related articles
  • More
    Article info Related articles

A New Variable Selection Approach Inspired by Supersaturated Designs Given a Large-Dimensional Dataset
Volume 12, Issue 1 (2014), pp. 35–52
Christina Parpoula   Krystallenia Drosou   Christos Koukouvinos     All authors (4)

Authors

 
Placeholder
https://doi.org/10.6339/JDS.2014.12(1).1183
Pub. online: 4 August 2022      Type: Research Article      Open accessOpen Access

Published
4 August 2022

Abstract

Abstract: The problem of variable selection is fundamental to statistical modelling in diverse fields of sciences. In this paper, we study in particular the problem of selecting important variables in regression problems in the case where observations and labels of a real-world dataset are available. At first, we examine the performance of several existing statistical methods for analyzing a real large trauma dataset which consists of 7000 observations and 70 factors, that include demographic, transport and intrahospital data. The statistical methods employed in this work are the nonconcave penalized likelihood methods (SCAD, LASSO, and Hard), the generalized linear logis tic regression, and the best subset variable selection (with AIC and BIC), used to detect possible risk factors of death. Supersaturated designs (SSDs) are a large class of factorial designs which can be used for screening out the important factors from a large set of potentially active variables. This paper presents a new variable selection approach inspired by supersaturated designs given a dataset of observations. The merits and the effectiveness of this approach for identifying important variables in observational studies are evaluated by considering several two-levels supersaturated designs, and a variety of different statistical models with respect to the combinations of factors and the number of observations. The derived results are encour aging since the alternative approach using supersaturated designs provided specific information that are logical and consistent with the medical experi ence, which may also assist as guidelines for trauma management.

Related articles PDF XML
Related articles PDF XML

Copyright
No copyright data available.

Keywords
Generalized linear model penalized likelihood supersaturated design

Metrics
since February 2021
445

Article info
views

284

PDF
downloads

Export citation

Copy and paste formatted citation
Placeholder

Download citation in file


Share


RSS

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X

About

  • About journal

For contributors

  • Submit
  • OA Policy
  • Become a Peer-reviewer

Contact us

  • JDS@ruc.edu.cn
  • No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China
Powered by PubliMill  •  Privacy policy