On the Use of Deep Neural Networks for Large-Scale Spatial Prediction
Volume 20, Issue 4 (2022): Special Issue: Large-Scale Spatial Data Science, pp. 493–511
Pub. online: 3 October 2022
Type: Data Science In Action
Open Access
Received: 28 July 2022
Accepted: 27 September 2022
Published: 3 October 2022
Abstract
For spatial kriging (prediction), the Gaussian process (GP) has been the go-to tool of spatial statisticians for decades. However, the GP is plagued by computational intractability, rendering it infeasible for use on large spatial data sets. Neural networks (NNs), on the other hand, have arisen as a flexible and computationally feasible approach for capturing nonlinear relationships. To date, however, NNs have been used only sparingly for problems in spatial statistics, though their use is beginning to take root. In this work, we argue for an equivalence between a NN and a GP and demonstrate how to implement NNs for kriging from large spatial data. We compare the computational efficacy and predictive power of NNs with that of GP approximations across a variety of big spatial Gaussian, non-Gaussian, and binary data applications of up to size $n=10^{6}$. Our results suggest that fully connected NNs perform similarly to state-of-the-art, GP-approximated models for short-range predictions but can suffer for longer-range predictions.
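The computational intractability mentioned above stems from the dense $n \times n$ covariance matrix that exact GP kriging must factorize, an $O(n^{3})$ operation. The sketch below (illustrative only, not the paper's code; the exponential covariance and its parameters are assumptions chosen for the example) shows simple kriging on synthetic 2-D locations, with the Cholesky/solve step that becomes infeasible at $n \approx 10^{6}$.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's implementation):
# simple kriging with an exponential covariance on synthetic 2-D locations.
rng = np.random.default_rng(0)
n, m = 200, 50                        # observed / prediction locations
S = rng.uniform(0, 1, size=(n, 2))    # observed spatial locations
S0 = rng.uniform(0, 1, size=(m, 2))   # prediction locations

def exp_cov(A, B, sigma2=1.0, phi=0.2):
    """Exponential covariance: sigma2 * exp(-||s - s'|| / phi)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

# Simulate a GP realization at the observed locations.
C = exp_cov(S, S) + 1e-10 * np.eye(n)   # tiny jitter for numerical stability
L = np.linalg.cholesky(C)               # the O(n^3) bottleneck
y = L @ rng.standard_normal(n)

# Kriging predictor: E[y0 | y] = C(S0, S) C(S, S)^{-1} y
w = np.linalg.solve(C, y)               # another O(n^3) step
y0 = exp_cov(S0, S) @ w                 # predictions at the m new locations

# With no nugget, kriging interpolates: predicting at an observed
# location recovers (up to jitter) the observed value.
y_at_obs = exp_cov(S[:5], S) @ w
```

For $n = 10^{6}$, `C` alone would occupy roughly 8 TB of memory, which is why the paper turns to GP approximations and to NNs, whose training cost scales with the number of observations rather than cubically.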