Journal of Data Science

A Comparison of Regularized Linear Discriminant Functions for Poorly-Posed Classification Problems
Volume 17, Issue 1 (2019), pp. 1–36
L. A. Thompson, Wade Davis, Phil D. Young, et al. (5 authors)

https://doi.org/10.6339/JDS.201901_17(1).0001
Pub. online: 4 August 2022 • Type: Research Article • Open Access

Abstract

For statistical classification problems where the total sample size is slightly greater than the feature dimension, regularized statistical discriminant rules may reduce classification error rates. We review ten dispersion-matrix regularization approaches: four for the pooled sample covariance matrix, four for the inverse pooled sample covariance matrix, and two for a diagonal covariance matrix, for use in Anderson’s (1951) linear discriminant function (LDF). We compare these regularized classifiers against the traditional LDF for a variety of parameter configurations, and use the estimated expected error rate (EER) to assess performance. We also apply the regularized LDFs to a well-known real-data example on colon cancer. We found that no regularized classifier uniformly outperformed the others. However, we found that the more contemporary classifiers (e.g., Thomaz and Gillies, 2005; Tong et al., 2012; and Xu et al., 2009) tended to outperform the older classifiers, and that certain simple methods (e.g., Pang et al., 2009; Thomaz and Gillies, 2005; and Tong et al., 2012) performed very well, calling into question the need for involved cross-validation when estimating regularization parameters. Nonetheless, an older regularized classifier proposed by Smidt and McDonald (1976) yielded consistently low misclassification rates across all scenarios, regardless of the shape of the true covariance matrix. Finally, our simulations showed that regularized classifiers that relied primarily on asymptotic approximations with respect to the training sample size rarely outperformed the traditional LDF, and are thus not recommended. We discuss our results as they pertain to the effect of high dimension, and offer general guidelines for choosing a regularization method for poorly-posed problems.
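
The traditional rule at the center of this comparison is Anderson’s (1951) LDF, which scores a new observation by plugging the pooled sample covariance matrix into a linear discriminant; the regularized variants replace that matrix (or its inverse) with a stabilized estimate. As a rough illustration only, here is a minimal Python sketch of the traditional LDF together with the simplest ridge-style adjustment, S + λI, assuming two training samples (rows as observations) and equal priors; the function names and the parameter lam are illustrative, not from the paper:

    # Minimal sketch (not the authors' code): Anderson's LDF with an
    # optional ridge-style regularization of the pooled covariance matrix.
    import numpy as np

    def pooled_cov(X1, X2):
        # Pooled sample covariance of two training samples (rows = observations).
        n1, n2 = len(X1), len(X2)
        S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
        return ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

    def ldf_classify(x, X1, X2, lam=0.0):
        # W(x) = (x - (xbar1 + xbar2)/2)' S^{-1} (xbar1 - xbar2);
        # assign x to population 1 if W(x) >= 0 (equal priors assumed).
        xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
        S = pooled_cov(X1, X2)
        S_reg = S + lam * np.eye(S.shape[0])   # lam = 0 gives the traditional LDF
        w = np.linalg.solve(S_reg, xbar1 - xbar2)
        return 1 if (x - 0.5 * (xbar1 + xbar2)) @ w >= 0 else 2

Setting lam = 0 recovers the traditional LDF; the classifiers compared in the paper differ mainly in how the shrinkage target and its weight are chosen (fixed, plug-in, or cross-validated).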

Keywords
Poorly-posed classification problems, Shrinkage estimator, Eigenvalue adjustment, Expected error rate

Journal of Data Science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
