On the Use of Deep Neural Networks for Large-Scale Spatial Prediction
Volume 20, Issue 4 (2022): Special Issue: Large-Scale Spatial Data Science, pp. 493–511
Pub. online: 3 October 2022
Type: Data Science In Action
Open Access
Received: 28 July 2022
Accepted: 27 September 2022
Published: 3 October 2022
Abstract
For spatial kriging (prediction), the Gaussian process (GP) has been the go-to tool of spatial statisticians for decades. However, the GP is plagued by computational intractability, rendering it infeasible for use on large spatial data sets. Neural networks (NNs), on the other hand, have arisen as a flexible and computationally feasible approach for capturing nonlinear relationships. To date, however, NNs have been used only sparingly for problems in spatial statistics, though their use is beginning to take root. In this work, we argue for an equivalence between a NN and a GP and demonstrate how to implement NNs for kriging from large spatial data. We compare the computational efficacy and predictive power of NNs with that of GP approximations across a variety of big spatial Gaussian, non-Gaussian, and binary data applications of up to size $n=10^{6}$. Our results suggest that fully connected NNs perform similarly to state-of-the-art, GP-approximated models for short-range predictions but can suffer for longer-range predictions.
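The computational intractability mentioned above stems from the dense $n \times n$ covariance matrix that exact GP kriging must factorize, an $O(n^{3})$ operation. The sketch below (illustrative only, not the paper's code; the exponential covariance and its parameters are assumptions chosen for the example) shows simple kriging on synthetic 2-D locations, with the Cholesky/solve step that becomes infeasible at $n \approx 10^{6}$.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's implementation):
# simple kriging with an exponential covariance on synthetic 2-D locations.
rng = np.random.default_rng(0)
n, m = 200, 50                        # observed / prediction locations
S = rng.uniform(0, 1, size=(n, 2))    # observed spatial locations
S0 = rng.uniform(0, 1, size=(m, 2))   # prediction locations

def exp_cov(A, B, sigma2=1.0, phi=0.2):
    """Exponential covariance: sigma2 * exp(-||s - s'|| / phi)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

# Simulate a GP realization at the observed locations.
C = exp_cov(S, S) + 1e-10 * np.eye(n)   # tiny jitter for numerical stability
L = np.linalg.cholesky(C)               # the O(n^3) bottleneck
y = L @ rng.standard_normal(n)

# Kriging predictor: E[y0 | y] = C(S0, S) C(S, S)^{-1} y
w = np.linalg.solve(C, y)               # another O(n^3) step
y0 = exp_cov(S0, S) @ w                 # predictions at the m new locations

# With no nugget, kriging interpolates: predicting at an observed
# location recovers (up to jitter) the observed value.
y_at_obs = exp_cov(S[:5], S) @ w
```

For $n = 10^{6}$, `C` alone would occupy roughly 8 TB of memory, which is why the paper turns to GP approximations and to NNs, whose training cost scales with the number of observations rather than cubically.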