SPA: Signflip Parallel Analysis to Optimize the Number of Principal Components in Two-dimensional PCA
Pub. online: 22 November 2024
Type: Statistical Data Science
Open Access
Received
23 July 2024
Accepted
20 October 2024
Published
22 November 2024
Abstract
Yang et al. (2004) developed two-dimensional principal component analysis (2DPCA) for image representation and recognition, and it is now widely used in fields including face recognition, biometric recognition, cancer diagnosis, and tumor classification. 2DPCA has been shown to perform better than traditional principal component analysis (PCA) and to be computationally more efficient. However, some theoretical properties of 2DPCA remain unknown, including how to determine the number of principal components (PCs) from the training set, which is a critical step in applying 2DPCA. The lack of rigorous criteria for determining the number of PCs hampers the wider application of 2DPCA. To address this issue, we propose a new method based on parallel analysis that determines the number of PCs in 2DPCA with statistical justification. Several image classification experiments demonstrate that the proposed method compares favourably with other state-of-the-art approaches in recognition accuracy and storage requirements, at a low computational cost.
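To make the idea concrete, the following is a minimal sketch of how signflip parallel analysis might be combined with the 2DPCA image covariance matrix. It is not the authors' code (see the supplementary material for that), and the procedure in the paper may differ in its details. The function names, the number of trials, the 0.95 quantile, and the sequential stopping rule are all assumptions made for illustration.

```python
import numpy as np


def gram(centered):
    """Image covariance of 2DPCA for already-centered images.

    centered has shape (N, m, n); the result is the (n, n) matrix
    (1/N) * sum_i centered[i].T @ centered[i].
    """
    return np.einsum("imj,imk->jk", centered, centered) / centered.shape[0]


def signflip_pa_2dpca(images, n_trials=50, quantile=0.95, seed=0):
    """Hypothetical signflip parallel analysis for 2DPCA.

    Assumption: random +/-1 signs are applied entrywise to the centered
    images, and an observed eigenvalue is retained only if it exceeds the
    chosen quantile of the matching signflipped eigenvalues.
    """
    rng = np.random.default_rng(seed)
    centered = images - images.mean(axis=0)

    # Observed eigenvalues of the image covariance, largest first.
    observed = np.linalg.eigvalsh(gram(centered))[::-1]

    # Reference eigenvalues from signflipped copies of the data.
    null = np.empty((n_trials, observed.size))
    for t in range(n_trials):
        signs = rng.choice([-1.0, 1.0], size=centered.shape)
        null[t] = np.linalg.eigvalsh(gram(centered * signs))[::-1]
    threshold = np.quantile(null, quantile, axis=0)

    # Keep leading components until the first one that fails the test.
    exceeds = observed > threshold
    k = exceeds.size if exceeds.all() else int(np.argmin(exceeds))
    return k, observed, threshold
```

Calling `signflip_pa_2dpca(images)` on an array of shape (N, m, n) returns the selected number of PCs together with the observed eigenvalues and the signflip thresholds. Signflipping is meant to destroy the shared low-rank structure while preserving per-entry magnitudes, so the thresholds approximate what noise alone would produce.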
Supplementary material
The supplementary material is a zipped folder containing the code and three data sets needed to reproduce all results. It is available at https://figshare.com/s/824176b60a12b8ee0535.
References
Ahn SC, Horenstein AR (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3): 1203–1227. https://doi.org/10.3982/ECTA8968
Buja A, Eyuboglu N (1992). Remarks on parallel analysis. Multivariate Behavioral Research, 27(4): 509–540. https://doi.org/10.1207/s15327906mbr2704_2
Cattell RB, Vogelmann S (1977). A comprehensive trial of the scree and KG criteria for determining the number of factors. Multivariate Behavioral Research, 12(3): 289–325. https://doi.org/10.1207/s15327906mbr1203_2
Georghiades AS, Belhumeur PN, Kriegman DJ (2001). From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6): 643–660. https://doi.org/10.1109/34.927464
Gumaei A, Hassan MM, Hassan MR, Alelaiwi A, Fortino G (2019). A hybrid feature extraction method with regularized extreme learning machine for brain tumor classification. IEEE Access, 7: 36266–36273. https://doi.org/10.1109/ACCESS.2019.2904145
Hong D, Sheng Y, Dobriban E (2020). Selecting the number of components in PCA via random signflips. arXiv preprint arXiv:2012.02985.
Horn JL (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30: 179–185. https://doi.org/10.1007/BF02289447
Onatski A (2010). Determining the number of factors from empirical distribution of eigenvalues. Review of Economics and Statistics, 92(4): 1004–1016. https://doi.org/10.1162/REST_a_00043
Owen AB, Wang J (2016). Bi-cross-validation for factor analysis. Statistical Science, 31(1): 119–139. https://doi.org/10.1214/15-STS539
Steven Eyobu O, Han DS (2018). Feature representation and data augmentation for human activity classification based on wearable IMU sensor data using a deep LSTM neural network. Sensors, 18(9): 2892. https://doi.org/10.3390/s18092892
Uddin MP, Mamun MA, Hossain MA (2021). PCA-based feature reduction for hyperspectral remote sensing image classification. IETE Technical Review, 38(4): 377–396. https://doi.org/10.1080/02564602.2020.1740615
Wan S, Xia Y, Qi L, Yang YH, Atiquzzaman M (2020). Automated colorization of a grayscale image with seed points propagation. IEEE Transactions on Multimedia, 22(7): 1756–1768. https://doi.org/10.1109/TMM.2020.2976573
Wang H (2012). Factor profiled sure independence screening. Biometrika, 99(1): 15–28. https://doi.org/10.1093/biomet/asr074
Wang P, Li Z, Wei Z, Wu T, Luo C, Jiang W, et al. (2024). Space-time-coding digital metasurface element design based on state recognition and mapping methods with CNN-LSTM-DNN. IEEE Transactions on Antennas and Propagation, 72(6): 4962–4975. https://doi.org/10.1109/TAP.2024.3349778
Wang Q, Gao Q, Gao X, Nie F (2017). Optimal mean two-dimensional principal component analysis with F-norm minimization. Pattern Recognition, 68: 286–294. https://doi.org/10.1016/j.patcog.2017.03.026
Yang J, Zhang D, Frangi AF, Yang JY (2004). Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1): 131–137. https://doi.org/10.1109/TPAMI.2004.1261097
Yang W, Wang S, Hu J, Tao X, Li Y (2024). Feature extraction and learning approaches for cancellable biometrics: a survey. CAAI Transactions on Intelligence Technology, 9(1): 4–25. https://doi.org/10.1049/cit2.12283
Zabalza J, Ren J, Yang M, Zhang Y, Wang J, Marshall S, et al. (2014). Novel folded-PCA for improved feature extraction and data reduction with hyperspectral imaging and SAR in remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing, 93: 112–122. https://doi.org/10.1016/j.isprsjprs.2014.04.006