Journal of Data Science

Random Machines: A Bagged-Weighted Support Vector Model with Free Kernel Choice
Volume 19, Issue 3 (2021), pp. 409–428
Anderson Ara · Mateus Maia · Francisco Louzada · et al.

https://doi.org/10.6339/21-JDS1014
Pub. online: 1 June 2021 · Type: Statistical Data Science

Received: 9 December 2020
Accepted: 28 April 2021
Published: 1 June 2021

Abstract

Improving statistical learning models to solve classification and regression problems more efficiently is a goal pursued by the scientific community. In particular, the support vector machine has become one of the most successful algorithms for these tasks. Despite the strong predictive capacity of the support vector approach, its performance depends on the choice of the model's hyperparameters, such as the kernel function to be used. Traditional procedures for deciding which kernel function to use are computationally expensive and, in general, become infeasible for certain datasets. In this paper, we propose a novel framework for kernel function selection called Random Machines. The results show improved accuracy and reduced computational time, evaluated over simulation scenarios and real-data benchmarks.

Supplementary material

The proposed model, Random Machines (RM), is also implemented in the R language and can be used through the rmachines package, available and documented on GitHub at https://github.com/MateusMaiaDS/rmachines. For an overall description of how to reproduce the results of this article, see the README at https://mateusmaiads.github.io/rmachines/.
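The core idea described in the abstract — sample a kernel for each bootstrap SVM with probability tied to its validation performance, then combine the ensemble with a weighted vote — can be illustrated with a short scikit-learn sketch. This is a simplified Python illustration, not the paper's exact specification or its R implementation; the sampling-probability and vote-weight formulas below are assumptions made only for the example.

```python
# Hedged sketch of the Random Machines idea: kernels are sampled with
# probabilities proportional to their validation accuracy, and bootstrap
# SVMs vote with validation-based weights. The exact formulas in the
# paper differ; this only illustrates the mechanism.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy two-class data (XOR-like, nonlinear boundary)
X = rng.normal(size=(300, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

kernels = ["linear", "poly", "rbf", "sigmoid"]

# Step 1: score each candidate kernel on a validation split
acc = np.array([SVC(kernel=k).fit(X_tr, y_tr).score(X_val, y_val)
                for k in kernels])
probs = acc / acc.sum()  # sampling probabilities (simplified assumption)

# Step 2: bagging -- each bootstrap SVM draws its kernel at random
B = 25
models, weights = [], []
for _ in range(B):
    idx = rng.integers(0, len(X_tr), len(X_tr))   # bootstrap sample
    k = rng.choice(kernels, p=probs)              # random kernel choice
    m = SVC(kernel=k).fit(X_tr[idx], y_tr[idx])
    models.append(m)
    weights.append(m.score(X_val, y_val))         # vote weight (assumption)

# Step 3: weighted majority vote over the ensemble
def predict(X_new):
    votes = np.array([w * (2 * m.predict(X_new) - 1)   # map {0,1} -> {-1,1}
                      for m, w in zip(models, weights)])
    return (votes.sum(axis=0) > 0).astype(int)

print(predict(X_val[:5]))
```

Because each base learner sees both a bootstrap sample and a randomly drawn kernel, the ensemble avoids committing to a single kernel choice up front, which is the expensive step the paper targets.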



Copyright
© 2021 The Author(s)
This is a free-to-read article.

Keywords
bagging · kernel functions · support vector machines

Funding
M.M.’s work was supported by a Science Foundation Ireland Career Development Award grant 17/CDA/4695. The authors are grateful for the partial funding provided by the Brazilian agencies CNPq and CAPES.


  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X