Journal of Data Science


Vecchia Approximations and Optimization for Multivariate Matérn Models
Volume 20, Issue 4 (2022): Special Issue: Large-Scale Spatial Data Science, pp. 475–492
Youssef Fahmy, Joseph Guinness

https://doi.org/10.6339/22-JDS1074
Pub. online: 14 October 2022      Type: Computing in Data Science      Open Access

Received: 1 August 2022
Accepted: 11 October 2022
Published: 14 October 2022

Abstract

We describe our implementation of the multivariate Matérn model for multivariate spatial datasets, using Vecchia’s approximation and a Fisher scoring optimization algorithm. We consider various parameterizations for the multivariate Matérn that have been proposed in the literature for ensuring model validity, as well as an unconstrained model. A strength of our study is that the code is tested on many real-world multivariate spatial datasets. We use it to study the effect of ordering and conditioning in Vecchia’s approximation and the restrictions imposed by the various parameterizations. We also consider a model in which co-located nuggets are correlated across components and find that forcing this cross-component nugget correlation to be zero can have a serious impact on the other model parameters, so we suggest allowing cross-component correlation in co-located nugget terms.
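To make the idea behind Vecchia’s approximation concrete, the following is a minimal univariate sketch in Python, not the authors’ implementation (which is the R code linked in the supplementary material). It assumes a zero-mean process with an exponential covariance (a Matérn with smoothness 0.5) plus a nugget, takes the observations in the order given, and conditions each one on at most `m` nearest previously ordered points. All function names and parameter values here are hypothetical and chosen only for illustration.

```python
import numpy as np

def exp_cov(locs, variance=1.0, range_=0.3, nugget=0.1):
    """Exponential covariance (Matern, smoothness 0.5) plus a nugget on the diagonal."""
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    return variance * np.exp(-d / range_) + nugget * np.eye(len(locs))

def exact_loglik(y, cov):
    """Exact zero-mean Gaussian log-likelihood via a Cholesky factor."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + z @ z)

def vecchia_loglik(y, locs, m, **cov_args):
    """Sum of conditional log-densities; point i is conditioned on at most
    m nearest neighbours among the points ordered before it."""
    total = 0.0
    for i in range(len(y)):
        if i > 0:
            d = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nbrs = np.argsort(d)[:m]
        else:
            nbrs = np.array([], dtype=int)
        idx = np.append(nbrs, i)  # conditioning set first, point i last
        cov = exp_cov(locs[idx], **cov_args)
        if len(nbrs) > 0:
            w = np.linalg.solve(cov[:-1, :-1], cov[:-1, -1])
            mean = w @ y[nbrs]
            var = cov[-1, -1] - w @ cov[:-1, -1]
        else:
            mean, var = 0.0, cov[-1, -1]
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

# Simulated data at random locations (rows are already in random order).
rng = np.random.default_rng(0)
locs = rng.uniform(size=(100, 2))
y = np.linalg.cholesky(exp_cov(locs)) @ rng.standard_normal(100)

full = exact_loglik(y, exp_cov(locs))
approx = vecchia_loglik(y, locs, m=10)
# Conditioning on all earlier points recovers the exact likelihood.
saturated = vecchia_loglik(y, locs, m=len(y) - 1)
```

With `m` equal to the number of earlier points, the sum of conditionals is the exact likelihood by the chain rule, which gives a simple correctness check. Smaller `m` trades accuracy for cost, and the quality of that trade depends on the ordering and the choice of conditioning sets, which are the effects the abstract says the paper studies.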

Supplementary material

The datasets and code used for this project can be found at https://github.com/yf297/GpGp_multi_paper.



Copyright
2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
Gaussian process; Fisher scoring; software

Funding
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. This work is supported by the National Science Foundation Division of Mathematical Sciences under grant numbers 1916208 and 1953088.


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
