Scalable Predictions for Spatial Probit Linear Mixed Models Using Nearest Neighbor Gaussian Processes
Volume 20, Issue 4 (2022): Special Issue: Large-Scale Spatial Data Science, pp. 533–544
Pub. online: 3 November 2022
Type: Statistical Data Science
Open Access
Received: 16 August 2022
Accepted: 6 October 2022
Published: 3 November 2022
Abstract
Spatial probit generalized linear mixed models (spGLMM) with a linear fixed effect and a spatial random effect, endowed with a Gaussian Process prior, are widely used for the analysis of binary spatial data. However, the canonical Bayesian implementation of this hierarchical mixed model can involve protracted Markov Chain Monte Carlo sampling. Alternative approaches have been proposed that circumvent this by directly representing the marginal likelihood from spGLMM in terms of multivariate normal cumulative distribution functions (cdf). We present a direct and fast rendition of this latter approach for predictions from a spatial probit linear mixed model. We show that the covariance matrix of the cdf characterizing the marginal cdf of binary spatial data from spGLMM is amenable to approximation using Nearest Neighbor Gaussian Processes (NNGP). This facilitates a scalable prediction algorithm for spGLMM using NNGP that only involves sparse or small matrix computations and can be deployed in an embarrassingly parallel manner. We demonstrate the accuracy and scalability of the algorithm via numerous simulation experiments and an analysis of species presence-absence data.
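The idea of writing probit probabilities as multivariate normal cdfs can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a latent model Y_i = 1(Z_i > 0) with Z = Xβ + w + ε, an exponential covariance for w, unit noise variance, and known parameters, and it conditions a prediction on only the m nearest observed neighbors as a simplification in the spirit of NNGP. All function names (`exp_cov`, `joint_prob`, `predict_prob`) and parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm


def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)


def joint_prob(y, mean, cov):
    """P(Y = y) for Y_i = 1(Z_i > 0), Z ~ N(mean, cov).

    With D = diag(2y - 1), P(Y = y) = Phi_n(D mean; 0, D cov D).
    """
    s = 2.0 * np.asarray(y, dtype=float) - 1.0
    signed_cov = cov * np.outer(s, s)  # D cov D
    return float(
        multivariate_normal.cdf(s * mean, mean=np.zeros(len(s)), cov=signed_cov)
    )


def predict_prob(coord0, x0, coords, X, y, beta, m=5, sigma2=1.0, phi=3.0):
    """Approximate P(Y_0 = 1 | Y) using only the m nearest neighbors.

    The conditional probability is a ratio of two small multivariate
    normal cdfs, so no matrix larger than (m + 1) x (m + 1) is formed.
    """
    dist = np.linalg.norm(coords - coord0, axis=1)
    nb = np.argsort(dist)[:m]  # indices of the nearest observed sites
    k = len(nb)
    pts = np.vstack([coord0[None, :], coords[nb]])
    # Latent covariance: spatial part plus unit probit noise variance.
    cov = exp_cov(pts, pts, sigma2, phi) + np.eye(k + 1)
    mean = np.concatenate([[x0 @ beta], X[nb] @ beta])
    y_nb = np.asarray(y)[nb]
    num = joint_prob(np.concatenate([[1], y_nb]), mean, cov)
    den = joint_prob(y_nb, mean[1:], cov[1:, 1:])
    return num / den


# Usage on simulated inputs (all values are illustrative).
rng = np.random.default_rng(0)
coords = rng.uniform(size=(50, 2))
X = np.column_stack([np.ones(50), rng.normal(size=50)])
beta = np.array([0.2, 0.5])
y = rng.integers(0, 2, size=50)
p0 = predict_prob(np.array([0.5, 0.5]), np.array([1.0, 0.0]), coords, X, y, beta)
print(f"approximate P(Y_0 = 1 | neighbors) = {p0:.3f}")
```

Because each prediction touches only its own neighbor set, predictions at different locations are independent computations and could be run in parallel. The cdf values come from numerical integration in SciPy, so they are approximate.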
Supplementary material
This supplementary material contains a discussion of why it is infeasible to directly use Monte Carlo sampling to estimate p(Y) in (4), an evaluation of the algorithms under consideration with respect to misclassification error, and details of the code and data used in the article.
References
Cao J, Durante D, Genton MG (2022). Scalable computation of predictive probabilities in probit models with Gaussian process priors. Journal of Computational and Graphical Statistics, 1–12. https://doi.org/10.1080/10618600.2022.2036614.
Zhang Z, Arellano-Valle RB, Genton MG, Huser R (2022). Tractable Bayes of skew-elliptical link models for correlated binary data. Biometrics. https://doi.org/10.1111/biom.13731.