Tuning Support Vector Machines and Boosted Trees Using Optimization Algorithms
Volume 22, Issue 4 (2024), pp. 575–590
Pub. online: 5 July 2023
Type: Computing In Data Science
Open Access
Received
17 March 2023
Accepted
29 May 2023
Published
5 July 2023
Abstract
Statistical learning methods have been growing in popularity in recent years. Many of these procedures have parameters that must be tuned for models to perform well. Research on tuning has been extensive for neural networks, but not for many other learning methods. We examined the behavior of tuning parameters for support vector machines, gradient boosting machines, and AdaBoost in both classification and regression settings. We used grid search to identify ranges of tuning parameters where good models can be found across many different datasets. We then explored different optimization algorithms for selecting a model across the tuning parameter space. Models selected by the optimization algorithms were compared to the best models obtained through grid search to identify well-performing algorithms. This information was used to create an R package, EZtune, that automatically tunes support vector machines and boosted trees.
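As a rough illustration of the grid-search step described above (not the authors' own code), the sketch below tunes an RBF support vector machine over a log-scale grid of cost and gamma values using the e1071 package; the parameter ranges are illustrative rather than those studied in the paper.

```r
# A minimal sketch of grid-search tuning for an SVM, assuming the e1071
# package; parameter ranges here are illustrative, not those from the paper.
library(e1071)

data(iris)

# Search a log-scale grid of cost and gamma values; tune() scores each grid
# point with 10-fold cross-validation by default.
grid <- tune.svm(Species ~ ., data = iris,
                 gamma = 10^(-5:1), cost = 10^(-2:4))

grid$best.parameters   # cost/gamma combination with the lowest CV error
grid$best.performance  # the corresponding cross-validated error
```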
Supplementary material
The following supplementary materials are available:

Appendix A: Description of optimization algorithms.

Appendix B: Performance tables.

R package EZtune: An R package that implements automatic tuning of SVMs, GBMs, and AdaBoost using the Hooke-Jeeves algorithm and a genetic algorithm. The package also contains the Lichen and Mullein datasets used in the examples in the article. The package is currently available on CRAN, and updates are available at https://github.com/jillbo1000/EZtune (GNU zipped tar file).

Code and data for creating grids and performing optimization tests: The code and data used to create the error and time response surfaces, and the code for testing the optimization algorithms, are available at https://github.com/jillbo1000/autotune.
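For readers who want to try the package, a minimal usage sketch follows. It assumes the eztune() and eztune_cv() interface documented on CRAN (method, optimizer, fast, and cross arguments); the dataset is only a stand-in for the Lichen and Mullein examples in the article.

```r
# A minimal sketch of automated tuning with EZtune (interface as documented
# on CRAN; not a run from the article itself).
library(EZtune)

x <- mtcars[, c("wt", "hp", "disp", "qsec")]
y <- mtcars$mpg

# Tune an SVM for regression with the Hooke-Jeeves optimizer ("hjn");
# optimizer = "ga" selects the genetic algorithm instead. fast = TRUE tunes
# on a subset of the data to reduce computation time.
fit <- eztune(x, y, method = "svm", optimizer = "hjn", fast = TRUE)

# The returned list contains the selected tuning parameters and fitted model;
# eztune_cv() reports a cross-validated error for the tuned model.
eztune_cv(x, y, fit, cross = 10)
```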
References
Birgin EG, Martínez JM, Raydan M (2000). Nonmonotone spectral projected gradient methods on convex sets. SIAM Journal on Optimization, 10(4): 1196–1211. https://doi.org/10.1137/S1052623497330963
Breiman L (2001). Random forests. Machine Learning, 45(1): 5–32. https://doi.org/10.1023/A:1010933404324
Byrd RH, Lu P, Nocedal J, Zhu C (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5): 1190–1208. https://doi.org/10.1137/0916069
Dai YH, Yuan Y (2001). An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research, 103(1–4): 33–47. https://doi.org/10.1023/A:1012930416777
Freund Y, Schapire RE (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1): 119–139. https://doi.org/10.1006/jcss.1997.1504
Friedman JH (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5): 1189–1232. https://doi.org/10.1214/aos/1013203451
Hooke R, Jeeves TA (1961). “Direct Search” solution of numerical and statistical problems. Journal of the ACM, 8(2): 212–229. https://doi.org/10.1145/321062.321069
Kaggle (2019). Ames housing dataset. https://www.kaggle.com/datasets/prevek18/ames-housing-dataset. Accessed: 2019-02-13.
Lundell J (2023). Eztune: A package for automated hyperparameter tuning in R. arXiv preprint arXiv:2303.12177.
Mahdavi M, Fesanghary M, Damangir E (2007). An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation, 188(2): 1567–1579. https://doi.org/10.1016/j.amc.2006.11.033
Mirjalili S (2015a). The ant lion optimizer. Advances in Engineering Software, 83: 80–98. https://doi.org/10.1016/j.advengsoft.2015.01.010
Mirjalili S (2015b). Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-Based Systems, 89: 228–249. https://doi.org/10.1016/j.knosys.2015.07.006
Mirjalili S (2016a). Dragonfly algorithm: A new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neural Computing & Applications, 27(4): 1053–1073. https://doi.org/10.1007/s00521-015-1920-1
Mirjalili S (2016b). SCA: A sine cosine algorithm for solving optimization problems. Knowledge-Based Systems, 96: 120–133. https://doi.org/10.1016/j.knosys.2015.12.022
Mirjalili S, Lewis A (2016). The whale optimization algorithm. Advances in Engineering Software, 95: 51–67. https://doi.org/10.1016/j.advengsoft.2016.01.008
Mirjalili S, Mirjalili SM, Lewis A (2014). Grey wolf optimizer. Advances in Engineering Software, 69: 46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007
Nash JC (2014a). On best practice optimization methods in R. Journal of Statistical Software, 60(2): 1–14. https://doi.org/10.18637/jss.v060.i02
Newman D, Hettich S, Blake C, Merz C (1998). UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
Saremi S, Mirjalili S, Lewis A (2017). Grasshopper optimisation algorithm: Theory and application. Advances in Engineering Software, 105: 30–47. https://doi.org/10.1016/j.advengsoft.2017.01.004
Scrucca L (2013). GA: A package for genetic algorithms in R. Journal of Statistical Software, 53(4): 1–37. https://doi.org/10.18637/jss.v053.i04
Smola AJ, Schölkopf B (2004). A tutorial on support vector regression. Statistics and Computing, 14(3): 199–222. https://doi.org/10.1023/B:STCO.0000035301.49549.88
Varadhan R, Gilbert P (2009). BB: An R package for solving a large system of nonlinear equations and for optimizing a high-dimensional nonlinear objective function. Journal of Statistical Software, 32(4): 1–26. https://doi.org/10.18637/jss.v032.i04