Journal of Data Science


Analyzing the Rainfall Pattern in Honduras Through Non-Homogeneous Hidden Markov Models
Gustavo Alexis Sabillón, Daiane Aparecida Zuanetti

https://doi.org/10.6339/23-JDS1091
Pub. online: 22 February 2023
Type: Data Science In Action
Open Access

Received: 31 August 2022
Accepted: 19 February 2023
Published: 22 February 2023

Abstract

One of the major interests in climate research over the last decades has been to understand and describe the rainfall patterns of specific areas of the world as functions of other climate covariates. We do this for historical climate monitoring data from Tegucigalpa, Honduras, using non-homogeneous hidden Markov models (NHMMs), dynamic models commonly used to identify and predict heterogeneous regimes. To estimate the NHMM in an efficient and scalable way, we propose a stochastic Expectation-Maximization (EM) algorithm and a Bayesian method, and compare their performance on synthetic data. Although these methodologies have already been used to estimate several other statistical models, this is not the case for NHMMs, which are still widely fitted by the traditional EM algorithm. We observe that, under the tested conditions, the Bayesian and stochastic EM algorithms perform similarly, and we discuss their slight differences. Analyzing the Honduras rainfall data set, we identify three heterogeneous rainfall periods and select temperature and humidity as the relevant covariates for explaining the dynamic relation among these periods.
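
To make the idea of a non-homogeneous hidden Markov chain concrete, the following minimal Python sketch shows one common way covariates can drive the transitions among hidden regimes: each row of the transition matrix is a softmax (multinomial logit) of a linear function of the covariates. This is an illustration under stated assumptions, not the authors' implementation. The link function, the coefficient values in `BETAS`, the covariate values in `COVARIATES`, and the function names are all hypothetical; only the choice of three states and of temperature and humidity as covariates comes from the abstract.

```python
import math
import random


def transition_matrix(betas, x):
    """Return a K x K transition matrix for covariate vector x.

    betas[i][j] is the (hypothetical) coefficient vector for moving from
    state i to state j; each row is a softmax of the linear scores.
    """
    rows = []
    for beta_i in betas:
        scores = [sum(b * v for b, v in zip(beta_ij, x)) for beta_ij in beta_i]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def simulate_states(betas, covariates, initial_state=0, seed=0):
    """Simulate a hidden state path whose transitions change with the covariates."""
    rng = random.Random(seed)
    states = [initial_state]
    for x in covariates[1:]:
        probs = transition_matrix(betas, x)[states[-1]]
        states.append(rng.choices(range(len(probs)), weights=probs)[0])
    return states


# Assumed values for illustration only: K = 3 states, covariate vector
# (intercept, standardized temperature, standardized humidity).
# Coefficients toward state 0 are fixed at zero as a reference category.
BETAS = [
    [[0.0, 0.0, 0.0], [-1.0, 0.5, -0.2], [-2.0, -0.3, 0.8]],
    [[0.0, 0.0, 0.0], [1.0, 0.2, 0.1], [-0.5, -0.4, 0.9]],
    [[0.0, 0.0, 0.0], [0.3, 0.6, -0.5], [1.2, -0.2, 0.7]],
]
COVARIATES = [
    [1.0, 0.4, -1.1],
    [1.0, 0.9, -0.3],
    [1.0, -0.2, 0.8],
    [1.0, -1.0, 1.5],
    [1.0, 0.1, 0.2],
]

states = simulate_states(BETAS, COVARIATES)
print(states)
```

Because the transition matrix is recomputed at every time step from that step's covariates, the chain is non-homogeneous; with all coefficients set to zero, every row reduces to a uniform distribution and the model collapses to a homogeneous chain.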

Supplementary material

The data supporting the findings are openly available on GitHub at https://github.com/gsabillon85/NHMM-Estimation and at https://gsabillon85.shinyapps.io/PrecipitationNHMM/. The R code used for the simulations is openly available in the same GitHub repository, https://github.com/gsabillon85/NHMM-Estimation.



Copyright
2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
Bayesian approach; dynamic models; estimation and classification performance; rainfall pattern description; stochastic EM algorithm


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
