<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1071</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1071</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>High-Dimensional Nonlinear Spatio-Temporal Filtering by Compressing Hierarchical Sparse Cholesky Factors</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Chakraborty</surname><given-names>Anirban</given-names></name><xref ref-type="aff" rid="j_jds1071_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Katzfuss</surname><given-names>Matthias</given-names></name><email xlink:href="mailto:katzfuss@gmail.com">katzfuss@gmail.com</email><xref ref-type="aff" rid="j_jds1071_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1071_aff_001"><label>1</label>Department of Statistics, <institution>Texas A&amp;M University</institution>, 3143 TAMU, College Station, TX 77843, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:katzfuss@gmail.com">katzfuss@gmail.com</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2022</year></pub-date><pub-date pub-type="epub"><day>3</day><month>10</month><year>2022</year></pub-date><volume>20</volume><issue>4</issue><fpage>461</fpage><lpage>474</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1071_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p><monospace>R</monospace> code to reproduce our results and figures is available at <uri>https://github.com/katzfuss-group/CHVfilter</uri>.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>1</day><month>8</month><year>2022</year></date><date date-type="accepted"><day>28</day><month>9</month><year>2022</year></date></history>
<permissions><copyright-statement>2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2022</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Spatio-temporal filtering is a common and challenging task in many environmental applications, where the evolution is often nonlinear and the dimension of the spatial state may be very high. We propose a scalable filtering approach based on a hierarchical sparse Cholesky representation of the filtering covariance matrix. At each time point, we compress the sparse Cholesky factor into a dense matrix with a small number of columns. After applying the evolution to each of these columns, we decompress to obtain a hierarchical sparse Cholesky factor of the forecast covariance, which can then be updated based on newly available data. We illustrate the Cholesky evolution via an equivalent representation in terms of spatial basis functions. We also demonstrate the advantage of our method in numerical comparisons, including one based on a high-dimensional, nonlinear Lorenz model.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>basis functions</kwd>
<kwd>data assimilation</kwd>
<kwd>hierarchical Vecchia approximation</kwd>
<kwd>Lorenz model</kwd>
<kwd>unscented Kalman filter</kwd>
</kwd-group>
<funding-group><award-group><funding-source xlink:href="https://doi.org/10.13039/100000001">National Science Foundation</funding-source><award-id>DMS–1654083</award-id><award-id>DMS–1953005</award-id></award-group><award-group><funding-source xlink:href="https://doi.org/10.13039/100000104">National Aeronautics and Space Administration</funding-source><award-id>80NM0018F0527</award-id></award-group><funding-statement>MK was partially supported by National Science Foundation (NSF) Grants DMS–1654083 and DMS–1953005, and by the National Aeronautics and Space Administration (80NM0018F0527).</funding-statement></funding-group>
</article-meta>
</front>
<body/>
<back>
<ref-list id="j_jds1071_reflist_001">
<title>References</title>
<ref id="j_jds1071_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Arasaratnam</surname> <given-names>I</given-names></string-name>, <string-name><surname>Haykin</surname> <given-names>S</given-names></string-name> (<year>2009</year>). <article-title>Cubature Kalman filters</article-title>. <source>IEEE Transactions on Automatic Control</source>, <volume>54</volume>(<issue>6</issue>): <fpage>1254</fpage>–<lpage>1269</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Castrillón-Candás</surname> <given-names>JE</given-names></string-name>, <string-name><surname>Genton</surname> <given-names>MG</given-names></string-name>, <string-name><surname>Yokota</surname> <given-names>R</given-names></string-name> (<year>2016</year>). <article-title>Multi-level restricted maximum likelihood covariance estimation and kriging for large non-gridded spatial datasets</article-title>. <source>Spatial Statistics</source>, <volume>18</volume>: <fpage>105</fpage>–<lpage>124</lpage>. <comment>Spatial Statistics Avignon: Emerging Patterns.</comment></mixed-citation>
</ref>
<ref id="j_jds1071_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Castrillón-Candás</surname> <given-names>JE</given-names></string-name>, <string-name><surname>Li</surname> <given-names>J</given-names></string-name>, <string-name><surname>Eijkhout</surname> <given-names>V</given-names></string-name> (<year>2013</year>). <article-title>A discrete adapted hierarchical basis solver for radial basis function interpolation</article-title>. <source>BIT</source>, <volume>53</volume>(<issue>1</issue>): <fpage>57</fpage>–<lpage>86</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_004">
<mixed-citation publication-type="chapter"> <string-name><surname>Chandrasekar</surname> <given-names>J</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>IS</given-names></string-name>, <string-name><surname>Bernstein</surname> <given-names>DS</given-names></string-name>, <string-name><surname>Ridley</surname> <given-names>AJ</given-names></string-name> (<year>2008</year>). <chapter-title>Reduced-rank unscented Kalman filtering using Cholesky-based decomposition</chapter-title>. In: <source>2008 American Control Conference</source>, <fpage>1274</fpage>–<lpage>1279</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Datta</surname> <given-names>A</given-names></string-name>, <string-name><surname>Banerjee</surname> <given-names>S</given-names></string-name>, <string-name><surname>Finley</surname> <given-names>AO</given-names></string-name>, <string-name><surname>Gelfand</surname> <given-names>AE</given-names></string-name> (<year>2016</year>). <article-title>Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets</article-title>. <source>Journal of the American Statistical Association</source>, <volume>111</volume>(<issue>514</issue>): <fpage>800</fpage>–<lpage>812</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_006">
<mixed-citation publication-type="other"> <string-name><surname>Fang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name> (<year>2020</year>). <article-title>The geometric unscented Kalman filter</article-title>. arXiv preprint: <uri>https://arxiv.org/abs/2009.13079</uri>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Gneiting</surname> <given-names>T</given-names></string-name>, <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name> (<year>2014</year>). <article-title>Probabilistic forecasting</article-title>. <source>Annual Review of Statistics and Its Application</source>, <volume>1</volume>(<issue>1</issue>): <fpage>125</fpage>–<lpage>151</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Gordon</surname> <given-names>N</given-names></string-name>, <string-name><surname>Salmond</surname> <given-names>D</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>A</given-names></string-name> (<year>1993</year>). <article-title>Novel approach to nonlinear/non-Gaussian Bayesian state estimation</article-title>. <source>IEE Proceedings. Part F. Radar and Signal Processing</source>, <volume>140</volume>(<issue>2</issue>): <fpage>107</fpage>–<lpage>113</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_009">
<mixed-citation publication-type="book"> <string-name><surname>Grewal</surname> <given-names>MS</given-names></string-name>, <string-name><surname>Andrews</surname> <given-names>AP</given-names></string-name> (<year>1993</year>). <source>Kalman Filtering: Theory and Practice</source>. <publisher-name>Prentice Hall</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Guinness</surname> <given-names>J</given-names></string-name> (<year>2018</year>). <article-title>Permutation and grouping methods for sharpening Gaussian process approximations</article-title>. <source>Technometrics</source>, <volume>60</volume>(<issue>4</issue>): <fpage>415</fpage>–<lpage>429</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Jurek</surname> <given-names>M</given-names></string-name>, <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name> (<year>2022</year>). <article-title>Hierarchical sparse Cholesky decomposition with applications to high-dimensional spatio-temporal filtering</article-title>. <source>Statistics and Computing</source>, <volume>32</volume>: <fpage>15</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Kalman</surname> <given-names>R</given-names></string-name> (<year>1960</year>). <article-title>A new approach to linear filtering and prediction problems</article-title>. <source>Journal of Basic Engineering</source>, <volume>82</volume>(<issue>1</issue>): <fpage>35</fpage>–<lpage>45</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_013">
<mixed-citation publication-type="other"> <string-name><surname>Kang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name> (<year>2021</year>). <article-title>Correlation-based sparse inverse Cholesky factorization for fast Gaussian-process inference</article-title>. arXiv preprint: <uri>https://arxiv.org/abs/2112.14591</uri>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name> (<year>2017</year>). <article-title>A multi-resolution approximation for massive spatial datasets</article-title>. <source>Journal of the American Statistical Association</source>, <volume>112</volume>(<issue>517</issue>): <fpage>201</fpage>–<lpage>214</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name>, <string-name><surname>Guinness</surname> <given-names>J</given-names></string-name> (<year>2021</year>). <article-title>A general framework for Vecchia approximations of Gaussian processes</article-title>. <source>Statistical Science</source>, <volume>36</volume>(<issue>1</issue>): <fpage>124</fpage>–<lpage>141</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name>, <string-name><surname>Guinness</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gong</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zilber</surname> <given-names>D</given-names></string-name> (<year>2020</year>). <article-title>Vecchia approximations of Gaussian-process predictions</article-title>. <source>Journal of Agricultural, Biological, and Environmental Statistics</source>, <volume>25</volume>(<issue>3</issue>): <fpage>383</fpage>–<lpage>414</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_017">
<mixed-citation publication-type="chapter"> <string-name><surname>Khazraj</surname> <given-names>H</given-names></string-name>, <string-name><surname>Faria da Silva</surname> <given-names>F</given-names></string-name>, <string-name><surname>Bak</surname> <given-names>CL</given-names></string-name> (<year>2016</year>). <chapter-title>A performance comparison between extended Kalman Filter and unscented Kalman Filter in power system dynamic state estimation</chapter-title>. In: <source>2016 51st International Universities Power Engineering Conference (UPEC)</source>, <fpage>1</fpage>–<lpage>6</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Liu</surname> <given-names>JS</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>R</given-names></string-name> (<year>1998</year>). <article-title>Sequential Monte Carlo methods for dynamic systems</article-title>. <source>Journal of the American Statistical Association</source>, <volume>93</volume>(<issue>443</issue>): <fpage>1032</fpage>–<lpage>1044</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Lorenz</surname> <given-names>EN</given-names></string-name> (<year>2005</year>). <article-title>Designing chaotic models</article-title>. <source>Journal of the Atmospheric Sciences</source>, <volume>62</volume>(<issue>5</issue>): <fpage>1574</fpage>–<lpage>1587</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Meng</surname> <given-names>D</given-names></string-name>, <string-name><surname>Miao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Shao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>J</given-names></string-name> (<year>2018</year>). <article-title>A seventh-degree cubature Kalman filter</article-title>. <source>Asian Journal of Control</source>, <volume>20</volume>(<issue>1</issue>): <fpage>250</fpage>–<lpage>262</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_021">
<mixed-citation publication-type="chapter"> <string-name><surname>Nychka</surname> <given-names>DW</given-names></string-name>, <string-name><surname>Anderson</surname> <given-names>JL</given-names></string-name> (<year>2010</year>). <chapter-title>Data assimilation</chapter-title>. In: <source>Handbook of Spatial Statistics</source> (<string-name><given-names>AE</given-names> <surname>Gelfand</surname></string-name>, <string-name><given-names>PJ</given-names> <surname>Diggle</surname></string-name>, <string-name><given-names>M</given-names> <surname>Fuentes</surname></string-name>, <string-name><given-names>P</given-names> <surname>Guttorp</surname></string-name>, eds.), <fpage>477</fpage>–<lpage>494</lpage>. <publisher-name>CRC Press</publisher-name>. <comment>Chapter 27</comment>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Ott</surname> <given-names>E</given-names></string-name>, <string-name><surname>Hunt</surname> <given-names>BR</given-names></string-name>, <string-name><surname>Szunyogh</surname> <given-names>I</given-names></string-name>, <string-name><surname>Zimin</surname> <given-names>AV</given-names></string-name>, <string-name><surname>Kostelich</surname> <given-names>EJ</given-names></string-name>, <string-name><surname>Corazza</surname> <given-names>M</given-names></string-name>, <etal>et al.</etal> (<year>2004</year>). <article-title>A local ensemble Kalman filter for atmospheric data assimilation</article-title>. <source>Tellus A</source>, <volume>56</volume>: <fpage>415</fpage>–<lpage>428</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Pitt</surname> <given-names>MK</given-names></string-name>, <string-name><surname>Shephard</surname> <given-names>N</given-names></string-name> (<year>1999</year>). <article-title>Filtering via simulation: Auxiliary particle filters</article-title>. <source>Journal of the American Statistical Association</source>, <volume>94</volume>(<issue>446</issue>): <fpage>590</fpage>–<lpage>599</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Schäfer</surname> <given-names>F</given-names></string-name>, <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name>, <string-name><surname>Owhadi</surname> <given-names>H</given-names></string-name> (<year>2021</year>). <article-title>Sparse Cholesky factorization by Kullback-Leibler minimization</article-title>. <source>SIAM Journal on Scientific Computing</source>, <volume>43</volume>(<issue>3</issue>): <fpage>A2019</fpage>–<lpage>A2046</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_025">
<mixed-citation publication-type="book"> <string-name><surname>Shumway</surname> <given-names>RH</given-names></string-name>, <string-name><surname>Stoffer</surname> <given-names>DS</given-names></string-name> (<year>2000</year>). <source>Time Series Analysis and Its Applications</source>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_026">
<mixed-citation publication-type="chapter"> <string-name><surname>St-Pierre</surname> <given-names>M</given-names></string-name>, <string-name><surname>Gingras</surname> <given-names>D</given-names></string-name> (<year>2004</year>). <chapter-title>Comparison between the unscented Kalman filter and the extended Kalman filter for the position estimation module of an integrated navigation information system</chapter-title>. In: <source>IEEE Intelligent Vehicles Symposium, 2004</source>, <fpage>831</fpage>–<lpage>835</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>Stein</surname> <given-names>ML</given-names></string-name>, <string-name><surname>Chi</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Welty</surname> <given-names>L</given-names></string-name> (<year>2004</year>). <article-title>Approximating likelihoods for large spatial data sets</article-title>. <source>Journal of the Royal Statistical Society, Series B</source>, <volume>66</volume>(<issue>2</issue>): <fpage>275</fpage>–<lpage>296</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_028">
<mixed-citation publication-type="journal"> <string-name><surname>Vecchia</surname> <given-names>A</given-names></string-name> (<year>1988</year>). <article-title>Estimation and model identification for continuous spatial processes</article-title>. <source>Journal of the Royal Statistical Society, Series B</source>, <volume>50</volume>(<issue>2</issue>): <fpage>297</fpage>–<lpage>312</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_029">
<mixed-citation publication-type="chapter"> <string-name><surname>Wan</surname> <given-names>E</given-names></string-name>, <string-name><surname>Van Der Merwe</surname> <given-names>R</given-names></string-name> (<year>2000</year>). <chapter-title>The unscented Kalman filter for nonlinear estimation</chapter-title>. In: <source>Adaptive Systems for Signal Processing, Communications, and Control. Lake Louise, Canada</source>, <fpage>153</fpage>–<lpage>158</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Wang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tse</surname> <given-names>CK</given-names></string-name> (<year>2013</year>). <article-title>Spherical simplex-radial cubature Kalman filter</article-title>. <source>IEEE Signal Processing Letters</source>, <volume>21</volume>(<issue>1</issue>): <fpage>43</fpage>–<lpage>46</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_031">
<mixed-citation publication-type="book"> <string-name><surname>West</surname> <given-names>M</given-names></string-name>, <string-name><surname>Harrison</surname> <given-names>J</given-names></string-name> (<year>1997</year>). <source>Bayesian Forecasting and Dynamic Models</source>. <series><italic>Springer Series in Statistics</italic></series>. <publisher-name>Springer-Verlag</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1071_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Zilber</surname> <given-names>D</given-names></string-name>, <string-name><surname>Katzfuss</surname> <given-names>M</given-names></string-name> (<year>2021</year>). <article-title>Vecchia-Laplace approximations of generalized Gaussian processes for big non-Gaussian spatial data</article-title>. <source>Computational Statistics &amp; Data Analysis</source>, <volume>153</volume>: <fpage>107081</fpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
