
Journal of Data Science (JDS)

ISSN: 1683-8602, 1680-743X

Publisher: School of Statistics, Renmin University of China

Article ID: JDS1167

DOI: 10.6339/25-JDS1167

Section: Data Science Reviews

A Statistician’s Selective Review of Neural Network Modeling: Algorithms and Applications

Chunming Zhang (1, ∗) (ORCID: https://orcid.org/0000-0002-3153-2662), Zhengjun Zhang (2, 1), Xinrui Zhong (1), Jialuo Li (1), Zhihao Zhao (1)

1 University of Wisconsin-Madison, Department of Statistics, Madison, Wisconsin, U.S.A.

2 School of Economics and Management, and MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation, University of Chinese Academy of Sciences, AMSS Center for Forecasting Sciences, Chinese Academy of Sciences, Beijing, China

∗Corresponding author. Email: czhang3@wisc.edu.

Publication date: 20 January 2025. Volume 23, Issue 4, pp. 676–694 (2025).

Supplementary Material

The MATLAB implementation, including a README file, is available at https://github.com/ChunmingZhangUW/Review-NNM_JDS. The supplementary file includes Appendix A for the proof of Proposition 1 and Appendix B for numerical illustrations of LSTM models in Section 5.2.

Received 25 October 2024; accepted 8 January 2025.

© 2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China. Open access article under the CC BY license.

Deep neural networks have a wide range of applications in data science. This paper reviews neural network modeling algorithms and their applications in both supervised and unsupervised learning. Key examples include: (i) binary classification and (ii) nonparametric regression function estimation, both implemented with feedforward neural networks (FNN); (iii) sequential data prediction using long short-term memory (LSTM) networks; and (iv) image classification using convolutional neural networks (CNN). All implementations are provided in MATLAB, making these methods accessible to statisticians and data scientists to support learning and practical application.

Keywords: classification; nonparametric regression; prediction; time series

C. Zhang’s work was partially supported by the U.S. National Science Foundation grants DMS-2013486 and DMS-1712418, as well as funding provided by the University of Wisconsin-Madison Office of the Vice Chancellor for Research and Graduate Education through the Wisconsin Alumni Research Foundation. Z. Zhang’s research was supported by NSFC 72442027.
