Journal of Data Science

SPA: Signflip Parallel Analysis to Optimize the Number of Principal Components in Two-dimensional PCA
Zhaoyuan Li, Yiling Kuang

https://doi.org/10.6339/24-JDS1158
Pub. online: 22 November 2024      Type: Statistical Data Science      Open Access

Received: 23 July 2024
Accepted: 20 October 2024
Published: 22 November 2024

Abstract

Yang et al. (2004) developed two-dimensional principal component analysis (2DPCA) for image representation and recognition. It is widely used in fields including face recognition, biometric recognition, cancer diagnosis, and tumor classification. 2DPCA has been shown to perform better and to be computationally more efficient than traditional principal component analysis (PCA). However, some theoretical properties of 2DPCA remain unknown, including how to determine the number of principal components (PCs) in the training set, which is the critical step in applying 2DPCA. The lack of rigorous criteria for determining the number of PCs hampers the wider application of 2DPCA. To address this issue, we propose a new method based on parallel analysis to determine the number of PCs in 2DPCA with statistical justification. Several image classification experiments demonstrate that the proposed method compares favourably to other state-of-the-art approaches in recognition accuracy and storage requirement, at a low computational cost.
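
The general idea of combining 2DPCA with a signflip form of parallel analysis can be illustrated with a short Python sketch. This is not the authors' implementation (their code is in the supplementary material); it is a minimal illustration under stated assumptions. It assumes the usual 2DPCA image covariance matrix G = (1/N) Σ (A_i − Ā)ᵀ(A_i − Ā), a null distribution obtained by multiplying every centered pixel by an independent random sign in the spirit of Hong et al. (2020), and a sequential rule that keeps the leading eigenvalues exceeding a quantile of their signflipped counterparts. The function names, the number of trials, and the 0.95 quantile are hypothetical choices made for illustration.

```python
import numpy as np


def image_cov_eigvals(centered):
    """Eigenvalues (descending) of G = (1/N) * sum_i A_i^T A_i.

    `centered` has shape (N, m, n): N mean-centered images of size m x n.
    """
    n_images = centered.shape[0]
    # Sum A_i^T A_i over all images i, giving an n x n symmetric matrix.
    G = np.einsum("imj,imk->jk", centered, centered) / n_images
    return np.linalg.eigvalsh(G)[::-1]


def signflip_num_pcs(images, n_trials=20, quantile=0.95, seed=0):
    """Estimate the number of 2DPCA components by signflip parallel analysis.

    Assumed rule (illustrative): keep the leading eigenvalues that exceed
    the given quantile of the matching eigenvalues of signflipped data,
    and stop at the first eigenvalue that does not.
    """
    images = np.asarray(images, dtype=float)
    rng = np.random.default_rng(seed)
    centered = images - images.mean(axis=0)
    observed = image_cov_eigvals(centered)

    null = np.empty((n_trials, observed.size))
    for t in range(n_trials):
        # Independent random signs destroy shared structure across columns
        # while keeping the magnitude of every entry unchanged.
        signs = rng.choice([-1.0, 1.0], size=centered.shape)
        null[t] = image_cov_eigvals(centered * signs)

    threshold = np.quantile(null, quantile, axis=0)
    exceeds = observed > threshold
    # Index of the first eigenvalue that fails the test (all pass: keep all).
    return observed.size if exceeds.all() else int(np.argmin(exceeds))
```

Calling `signflip_num_pcs(images)` on an array of shape (N, m, n) returns an integer k; the k leading eigenvectors of G would then serve as the projection axes. Because the signflips are random, results depend on `seed` and `n_trials`.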

Supplementary material

The supplementary material is a zipped folder containing the code and three data sets needed to reproduce all results. It is available at https://figshare.com/s/824176b60a12b8ee0535.

References

 
Ahn SC, Horenstein AR (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3): 1203–1227. https://doi.org/10.3982/ECTA8968
 
Bai Z, Silverstein JW (2010). Spectral Analysis of Large Dimensional Random Matrices. Springer, New York.
 
Buja A, Eyuboglu N (1992). Remarks on parallel analysis. Multivariate Behavioral Research, 27(4): 509–540. https://doi.org/10.1207/s15327906mbr2704_2
 
Cattell RB, Vogelmann S (1977). A comprehensive trial of the scree and KG criteria for determining the number of factors. Multivariate Behavioral Research, 12(3): 289–325. https://doi.org/10.1207/s15327906mbr1203_2
 
Dhahri H, Al Maghayreh E, Mahmood A, Elkilani W, Faisal Nagi M (2019). Automated breast cancer diagnosis based on machine learning algorithms. Journal of Healthcare Engineering, 2019(1): 4253641.
 
Ejaz MS, Islam MR, Sifatullah M, Sarker A (2019). Implementation of principal component analysis on masked and non-masked face recognition. In: 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), 1–5. IEEE.
 
Georghiades AS, Belhumeur PN, Kriegman DJ (2001). From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6): 643–660. https://doi.org/10.1109/34.927464
 
Gumaei A, Hassan MM, Hassan MR, Alelaiwi A, Fortino G (2019). A hybrid feature extraction method with regularized extreme learning machine for brain tumor classification. IEEE Access, 7: 36266–36273. https://doi.org/10.1109/ACCESS.2019.2904145
 
Hair JF Jr, Anderson RE, Tatham RL (1986). Multivariate Data Analysis with Readings. Macmillan Publishing Co., Inc.
 
Hong D, Sheng Y, Dobriban E (2020). Selecting the number of components in PCA via random signflips. arXiv preprint arXiv:2012.02985.
 
Horn JL (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30: 179–185. https://doi.org/10.1007/BF02289447
 
Lam C, Yao Q (2012). Factor modeling for high-dimensional time series: inference for the number of factors. The Annals of Statistics, 694–726.
 
Onatski A (2010). Determining the number of factors from empirical distribution of eigenvalues. Review of Economics and Statistics, 92(4): 1004–1016. https://doi.org/10.1162/REST_a_00043
 
Owen AB, Wang J (2016). Bi-cross-validation for factor analysis. Statistical Science, 31(1): 119–139. https://doi.org/10.1214/15-STS539
 
Steven Eyobu O, Han DS (2018). Feature representation and data augmentation for human activity classification based on wearable IMU sensor data using a deep LSTM neural network. Sensors, 18(9): 2892. https://doi.org/10.3390/s18092892
 
Turk MA, Pentland AP (1991). Face recognition using eigenfaces. In: Proceedings of 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 586–587. IEEE Computer Society.
 
Uddin MP, Mamun MA, Hossain MA (2021). PCA-based feature reduction for hyperspectral remote sensing image classification. IETE Technical Review, 38(4): 377–396. https://doi.org/10.1080/02564602.2020.1740615
 
Wan S, Xia Y, Qi L, Yang YH, Atiquzzaman M (2020). Automated colorization of a grayscale image with seed points propagation. IEEE Transactions on Multimedia, 22(7): 1756–1768. https://doi.org/10.1109/TMM.2020.2976573
 
Wang H (2012). Factor profiled sure independence screening. Biometrika, 99(1): 15–28. https://doi.org/10.1093/biomet/asr074
 
Wang P, Li Z, Wei Z, Wu T, Luo C, Jiang W, et al. (2024). Space-time-coding digital metasurface element design based on state recognition and mapping methods with CNN-LSTM-DNN. IEEE Transactions on Antennas and Propagation, 72(6): 4962–4975. https://doi.org/10.1109/TAP.2024.3349778
 
Wang Q, Gao Q, Gao X, Nie F (2017). Optimal mean two-dimensional principal component analysis with F-norm minimization. Pattern Recognition, 68: 286–294. https://doi.org/10.1016/j.patcog.2017.03.026
 
Yang J, Zhang D, Frangi AF, Yang JY (2004). Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1): 131–137. https://doi.org/10.1109/TPAMI.2004.1261097
 
Yang W, Wang S, Hu J, Tao X, Li Y (2024). Feature extraction and learning approaches for cancellable biometrics: a survey. CAAI Transactions on Intelligence Technology, 9(1): 4–25. https://doi.org/10.1049/cit2.12283
 
Yilmaz A, Gokmen M (2000). Eigenhill vs. eigenface and eigenedge. In: Proceedings 15th International Conference on Pattern Recognition. ICPR-2000, volume 2, 827–830. IEEE.
 
Zabalza J, Ren J, Yang M, Zhang Y, Wang J, Marshall S, et al. (2014). Novel folded-PCA for improved feature extraction and data reduction with hyperspectral imaging and SAR in remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing, 93: 112–122. https://doi.org/10.1016/j.isprsjprs.2014.04.006
 
Zeng X, Wang X, Xie Y (2024). Multiple pseudo-siamese network with supervised contrast learning for medical multi-modal retrieval. ACM Transactions on Multimedia Computing Communications and Applications, 20(5): 1–23.


Copyright
2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
2DPCA, feature extraction, image analysis

Funding
Zhaoyuan Li’s research is partially supported by the National Natural Science Foundation of China (No. 11901492) and the Shenzhen Science and Technology Program (ZDSYS 20211021111415025).

Journal of Data Science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
