Journal of Data Science


A Hybrid Monitoring Procedure for Detecting Abnormality with Application to Energy Consumption Data
Volume 20, Issue 2 (2022), pp. 135–155
Daeyoung Lim, Ming-Hui Chen, Nalini Ravishanker, et al. (6 authors in total)

https://doi.org/10.6339/22-JDS1039
Pub. online: 16 March 2022      Type: Data Science In Action      Open Access

Received: 11 December 2021
Accepted: 13 February 2022
Published: 16 March 2022

Abstract

The complexity of energy infrastructure at large institutions increasingly calls for data-driven monitoring of energy usage. This article presents a hybrid monitoring algorithm for detecting consumption surges using statistical hypothesis testing, leveraging the posterior distribution and its information about uncertainty to introduce randomness in the parameter estimates, while retaining the frequentist testing framework. This hybrid approach is designed to be asymptotically equivalent to the Neyman-Pearson test. We show via extensive simulation studies that the hybrid approach enjoys control over type-1 error rate even with finite sample sizes whereas the naive plug-in method tends to exceed the specified level, resulting in overpowered tests. The proposed method is applied to the natural gas usage data at the University of Connecticut.
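
The contrast the abstract draws between a naive plug-in test and a test that carries posterior uncertainty can be illustrated with a toy simulation. The sketch below is not the authors' algorithm: it assumes a hypothetical setting with i.i.d. normal historical readings, a one-sided test for a single new reading, and the standard noninformative prior p(mu, sigma^2) proportional to 1/sigma^2. The function names `upper_tail` and `simulate` are made up for this illustration. The plug-in test treats the sample mean and standard deviation as if they were the true parameters; the second test averages the tail probability over posterior draws of (mu, sigma^2) before comparing it with the nominal level.

```python
import math
import random
import statistics


def upper_tail(z):
    """Upper-tail probability P(Z > z) of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def simulate(n=5, alpha=0.05, reps=4000, draws=100, seed=1):
    """Empirical type I error of two one-sided surge tests under the null.

    Assumed toy model (not from the article): n historical readings and one
    new reading, all i.i.d. N(0, 1), so every rejection is a false alarm.
    Returns (plug_in_rate, posterior_averaged_rate).
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1.0 - alpha)
    plug_in_rejections = 0
    averaged_rejections = 0

    for _ in range(reps):
        history = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_new = rng.gauss(0.0, 1.0)
        xbar = statistics.fmean(history)
        s2 = statistics.variance(history)  # unbiased sample variance

        # Naive plug-in: pretend (xbar, s2) are the true mean and variance.
        if x_new > xbar + z_crit * math.sqrt(s2):
            plug_in_rejections += 1

        # Posterior-averaged p-value under the prior p(mu, sigma^2) ~ 1/sigma^2:
        #   sigma^2 | data      ~ (n - 1) * s2 / chi-square(n - 1)
        #   mu | sigma^2, data  ~ N(xbar, sigma^2 / n)
        total = 0.0
        for _ in range(draws):
            chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)  # chi-square(n - 1)
            sigma2 = (n - 1) * s2 / chi2
            mu = rng.gauss(xbar, math.sqrt(sigma2 / n))
            total += upper_tail((x_new - mu) / math.sqrt(sigma2))
        if total / draws < alpha:
            averaged_rejections += 1

    return plug_in_rejections / reps, averaged_rejections / reps
```

Calling `simulate()` returns the two empirical false-alarm rates. In this toy model the plug-in rate should sit clearly above the nominal level for small `n`, because the test ignores the estimation error in the mean and variance, while the posterior-averaged rate should stay close to `alpha`. Increasing `n` shrinks the gap between the two.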

Supplementary material

An R package implementing our method is available at https://github.com/daeyounglim/energystuff. The repository contains R functions that run the proposed method, an R program for generating simulated data sets, and an R wrapper function that simplifies running simulations over a large number of data sets.



Copyright
2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
Bayesian; computationally-intensive method; frequentist; hypothesis testing
