<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1039</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1039</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science in Action</subject></subj-group></article-categories>
<title-group>
<article-title>A Hybrid Monitoring Procedure for Detecting Abnormality with Application to Energy Consumption Data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Lim</surname><given-names>Daeyoung</given-names></name><xref ref-type="aff" rid="j_jds1039_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1935-2447</contrib-id>
<name><surname>Chen</surname><given-names>Ming-Hui</given-names></name><email xlink:href="mailto:ming-hui.chen@uconn.edu">ming-hui.chen@uconn.edu</email><xref ref-type="aff" rid="j_jds1039_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Ravishanker</surname><given-names>Nalini</given-names></name><xref ref-type="aff" rid="j_jds1039_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Bolduc</surname><given-names>Mark</given-names></name><xref ref-type="aff" rid="j_jds1039_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>McKeon</surname><given-names>Brian</given-names></name><xref ref-type="aff" rid="j_jds1039_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Nolan</surname><given-names>Stanley</given-names></name><xref ref-type="aff" rid="j_jds1039_aff_002">2</xref>
</contrib>
<aff id="j_jds1039_aff_001"><label>1</label>Department of Statistics, <institution>University of Connecticut</institution>, 215 Glenbrook Rd. U-4120, Storrs, CT 06269-4120, <country>United States</country></aff>
<aff id="j_jds1039_aff_002"><label>2</label>Facilities Operations, <institution>University of Connecticut</institution>, 25 LeDoyt Road U-3252, Storrs, CT 06269-3252, <country>United States</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:ming-hui.chen@uconn.edu">ming-hui.chen@uconn.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2022</year></pub-date><pub-date pub-type="epub"><day>16</day><month>3</month><year>2022</year></pub-date><volume>20</volume><issue>2</issue><fpage>135</fpage><lpage>155</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1039_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>An R package implementing our method is available at <uri>https://github.com/daeyounglim/energystuff</uri>. The repository contains R functions that run the proposed method, an R program that generates simulation data sets, and an R wrapper function that simplifies running simulations over a large number of data sets.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>11</day><month>12</month><year>2021</year></date><date date-type="accepted"><day>13</day><month>2</month><year>2022</year></date></history>
<permissions><copyright-statement>2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2022</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The complexity of energy infrastructure at large institutions increasingly calls for data-driven monitoring of energy usage. This article presents a hybrid monitoring algorithm that detects consumption surges through statistical hypothesis testing. The algorithm uses the posterior distribution, and the uncertainty it encodes, to introduce randomness into the parameter estimates while retaining the frequentist testing framework. The hybrid approach is designed to be asymptotically equivalent to the Neyman-Pearson test. Extensive simulation studies show that it controls the type I error rate even with finite sample sizes, whereas the naive plug-in method tends to exceed the specified level, resulting in overpowered tests. The proposed method is applied to natural gas usage data from the University of Connecticut.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>Bayesian</kwd>
<kwd>computationally-intensive method</kwd>
<kwd>frequentist</kwd>
<kwd>hypothesis testing</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1039_reflist_001">
<title>References</title>
<ref id="j_jds1039_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Benjamini</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hochberg</surname> <given-names>Y</given-names></string-name> (<year>1995</year>). <article-title>Controlling the false discovery rate: A practical and powerful approach to multiple testing</article-title>. <source>Journal of the Royal Statistical Society: Series B (Methodological)</source>, <volume>57</volume>(<issue>1</issue>): <fpage>289</fpage>–<lpage>300</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_002">
<mixed-citation publication-type="book"> <string-name><surname>Capehart</surname> <given-names>BL</given-names></string-name>, <string-name><surname>Turner</surname> <given-names>WC</given-names></string-name>, <string-name><surname>Kennedy</surname> <given-names>WJ</given-names></string-name> (<year>2020</year>). <source>Guide to Energy Management</source>. <publisher-name>River Publishers</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_003">
<mixed-citation publication-type="book"> <string-name><surname>Casella</surname> <given-names>G</given-names></string-name>, <string-name><surname>Berger</surname> <given-names>RL</given-names></string-name> (<year>2002</year>). <source>Statistical Inference</source>. <publisher-name>Cengage Learning</publisher-name>, <edition>2</edition>nd edition.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_004">
<mixed-citation publication-type="book"> <string-name><surname>Cormen</surname> <given-names>TH</given-names></string-name>, <string-name><surname>Leiserson</surname> <given-names>CE</given-names></string-name>, <string-name><surname>Rivest</surname> <given-names>RL</given-names></string-name>, <string-name><surname>Stein</surname> <given-names>C</given-names></string-name> (<year>2022</year>). <source>Introduction to Algorithms</source>. <publisher-name>The MIT Press</publisher-name>, <edition>4</edition>th edition.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_005">
<mixed-citation publication-type="book"> <string-name><surname>Doty</surname> <given-names>S</given-names></string-name>, <string-name><surname>Turner</surname> <given-names>WC</given-names></string-name> (<year>2004</year>). <source>Energy Management Handbook</source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Dunn</surname> <given-names>OJ</given-names></string-name> (<year>1961</year>). <article-title>Multiple comparisons among means</article-title>. <source>Journal of the American Statistical Association</source>, <volume>56</volume>(<issue>293</issue>): <fpage>52</fpage>–<lpage>64</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Eddelbuettel</surname> <given-names>D</given-names></string-name>, <string-name><surname>Balamuta</surname> <given-names>JJ</given-names></string-name> (<year>2018</year>). <article-title>Extending R with C++: A brief introduction to Rcpp</article-title>. <source>The American Statistician</source>, <volume>72</volume>(<issue>1</issue>): <fpage>28</fpage>–<lpage>36</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Eddelbuettel</surname> <given-names>D</given-names></string-name>, <string-name><surname>Sanderson</surname> <given-names>C</given-names></string-name> (<year>2014</year>). <article-title>RcppArmadillo: Accelerating R with high-performance C++ linear algebra</article-title>. <source>Computational Statistics and Data Analysis</source>, <volume>71</volume>: <fpage>1054</fpage>–<lpage>1063</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_009">
<mixed-citation publication-type="book"> <string-name><surname>Efron</surname> <given-names>B</given-names></string-name>, <string-name><surname>Tibshirani</surname> <given-names>RJ</given-names></string-name> (<year>1994</year>). <source>An Introduction to the Bootstrap</source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Fu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Jeske</surname> <given-names>DR</given-names></string-name> (<year>2014</year>). <article-title>SPC methods for nonstationary correlated count data with application to network surveillance</article-title>. <source>Applied Stochastic Models in Business and Industry</source>, <volume>30</volume>(<issue>6</issue>): <fpage>708</fpage>–<lpage>722</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Geisser</surname> <given-names>S</given-names></string-name>, <string-name><surname>Cornfield</surname> <given-names>J</given-names></string-name> (<year>1963</year>). <article-title>Posterior distributions for multivariate normal parameters</article-title>. <source>Journal of the Royal Statistical Society. Series B (Methodological)</source>, <volume>25</volume>(<issue>2</issue>): <fpage>368</fpage>–<lpage>376</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Gelman</surname> <given-names>A</given-names></string-name>, <string-name><surname>Meng</surname> <given-names>XL</given-names></string-name>, <string-name><surname>Stern</surname> <given-names>H</given-names></string-name> (<year>1996</year>). <article-title>Posterior predictive assessment of model fitness via realized discrepancies</article-title>. <source>Statistica Sinica</source>, <volume>6</volume>(<issue>4</issue>): <fpage>733</fpage>–<lpage>760</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Hjort</surname> <given-names>NL</given-names></string-name>, <string-name><surname>Dahl</surname> <given-names>FA</given-names></string-name>, <string-name><surname>Steinbakk</surname> <given-names>GH</given-names></string-name> (<year>2006</year>). <article-title>Post-processing posterior predictive p values</article-title>. <source>Journal of the American Statistical Association</source>, <volume>101</volume>(<issue>475</issue>): <fpage>1157</fpage>–<lpage>1174</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Hochberg</surname> <given-names>Y</given-names></string-name> (<year>1988</year>). <article-title>A sharper Bonferroni procedure for multiple tests of significance</article-title>. <source>Biometrika</source>, <volume>75</volume>(<issue>4</issue>): <fpage>800</fpage>–<lpage>802</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Holm</surname> <given-names>S</given-names></string-name> (<year>1979</year>). <article-title>A simple sequentially rejective multiple test procedure</article-title>. <source>Scandinavian Journal of Statistics</source>, <volume>6</volume>(<issue>2</issue>): <fpage>65</fpage>–<lpage>70</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Hommel</surname> <given-names>G</given-names></string-name> (<year>1988</year>). <article-title>A stagewise rejective multiple test procedure based on a modified Bonferroni test</article-title>. <source>Biometrika</source>, <volume>75</volume>(<issue>2</issue>): <fpage>383</fpage>–<lpage>386</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_017">
<mixed-citation publication-type="book"> <string-name><surname>Jeffreys</surname> <given-names>H</given-names></string-name> (<year>1998</year>). <source>The Theory of Probability</source>. <publisher-name>OUP Oxford</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Meng</surname> <given-names>XL</given-names></string-name> (<year>1994</year>). <article-title>Posterior predictive <italic>p</italic>-values</article-title>. <source>The Annals of Statistics</source>, <volume>22</volume>(<issue>3</issue>): <fpage>1142</fpage>–<lpage>1160</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Neyman</surname> <given-names>J</given-names></string-name>, <string-name><surname>Pearson</surname> <given-names>ES</given-names></string-name> (<year>1933</year>). <article-title>IX. On the problem of the most efficient tests of statistical hypotheses</article-title>. <source>Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character</source>, <volume>231</volume>(<issue>694–706</issue>): <fpage>289</fpage>–<lpage>337</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_020">
<mixed-citation publication-type="other"> <collab>OpenMP Architecture Review Board</collab> (<year>2018</year>). <source>OpenMP Application Programming Interface Version 5.0</source>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_021">
<mixed-citation publication-type="book"> <collab>R Core Team</collab> (<year>2021</year>). <source>R: A Language and Environment for Statistical Computing</source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Raftery</surname> <given-names>AE</given-names></string-name>, <string-name><surname>Akman</surname> <given-names>V</given-names></string-name> (<year>1986</year>). <article-title>Bayesian analysis of a Poisson process with a change-point</article-title>. <source>Biometrika</source>, <volume>73</volume>(<issue>1</issue>): <fpage>85</fpage>–<lpage>89</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_023">
<mixed-citation publication-type="book"> <string-name><surname>Rao</surname> <given-names>CR</given-names></string-name> (<year>1973</year>). <source>Linear Statistical Inference and Its Applications</source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_024">
<mixed-citation publication-type="chapter"> <string-name><surname>Rashid</surname> <given-names>H</given-names></string-name>, <string-name><surname>Singh</surname> <given-names>P</given-names></string-name> (<year>2018</year>). <chapter-title>Monitor: An abnormality detection approach in buildings energy consumption</chapter-title>. In: <source>2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC)</source>, <fpage>16</fpage>–<lpage>25</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_025">
<mixed-citation publication-type="book"> <string-name><surname>Ravishanker</surname> <given-names>N</given-names></string-name>, <string-name><surname>Chi</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Dey</surname> <given-names>DK</given-names></string-name> (<year>2022</year>). <source>A First Course in Linear Model Theory</source>. <publisher-name>Chapman and Hall/CRC</publisher-name>, <edition>2</edition>nd edition.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Ross</surname> <given-names>GJ</given-names></string-name>, <string-name><surname>Tasoulis</surname> <given-names>DK</given-names></string-name>, <string-name><surname>Adams</surname> <given-names>NM</given-names></string-name> (<year>2011</year>). <article-title>Nonparametric monitoring of data streams for changes in location and scale</article-title>. <source>Technometrics</source>, <volume>53</volume>(<issue>4</issue>): <fpage>379</fpage>–<lpage>389</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>Ross</surname> <given-names>GJ</given-names></string-name>, <string-name><surname>Tasoulis</surname> <given-names>DK</given-names></string-name>, <string-name><surname>Adams</surname> <given-names>NM</given-names></string-name> (<year>2013</year>). <article-title>Sequential monitoring of a Bernoulli sequence when the pre-change parameter is unknown</article-title>. <source>Computational Statistics</source>, <volume>28</volume>(<issue>2</issue>): <fpage>463</fpage>–<lpage>479</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_028">
<mixed-citation publication-type="journal"> <string-name><surname>Seem</surname> <given-names>JE</given-names></string-name> (<year>2007</year>). <article-title>Using intelligent data analysis to detect abnormal energy consumption in buildings</article-title>. <source>Energy and Buildings</source>, <volume>39</volume>(<issue>1</issue>): <fpage>52</fpage>–<lpage>58</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_029">
<mixed-citation publication-type="journal"> <string-name><surname>Šidák</surname> <given-names>Z</given-names></string-name> (<year>1967</year>). <article-title>Rectangular confidence regions for the means of multivariate normal distributions</article-title>. <source>Journal of the American Statistical Association</source>, <volume>62</volume>(<issue>318</issue>): <fpage>626</fpage>–<lpage>633</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Simes</surname> <given-names>RJ</given-names></string-name> (<year>1986</year>). <article-title>An improved Bonferroni procedure for multiple tests of significance</article-title>. <source>Biometrika</source>, <volume>73</volume>(<issue>3</issue>): <fpage>751</fpage>–<lpage>754</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Sun</surname> <given-names>D</given-names></string-name>, <string-name><surname>Berger</surname> <given-names>JO</given-names></string-name> (<year>2007</year>). <article-title>Objective Bayesian analysis for the multivariate normal model</article-title>. <source>Bayesian Statistics</source>, <volume>8</volume>: <fpage>525</fpage>–<lpage>562</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_032">
<mixed-citation publication-type="other"> <collab>University of Michigan</collab> (<year>2011</year>). <source>Final Report: Assessing a Campus Energy Monitoring System</source>. <uri>http://graham.umich.edu/media/files/campus-course-reports/CEMS%20Final%20Report.pdf</uri>. Accessed: 2021-10-25.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_033">
<mixed-citation publication-type="other"> <collab>Worcester Polytechnic Institute</collab> (<year>2007</year>). <source>Monitoring Electricity Consumption on the WPI Campus: The Reduction of Carbon Emissions Through the Implementation of Energy Information Tracking Technology</source>. <uri>https://web.wpi.edu/Pubs/E-project/Available/E-project-060107-130245/unrestricted/iqpfinaldraft.pdf</uri>. Accessed: 2021-10-25.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>Wright</surname> <given-names>SP</given-names></string-name> (<year>1992</year>). <article-title>Adjusted p-values for simultaneous inference</article-title>. <source>Biometrics</source>, <volume>48</volume>(<issue>4</issue>): <fpage>1005</fpage>–<lpage>1013</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_035">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Paschalidis</surname> <given-names>IC</given-names></string-name> (<year>2018</year>). <article-title>Statistical anomaly detection via composite hypothesis testing for Markov models</article-title>. <source>IEEE Transactions on Signal Processing</source>, <volume>66</volume>(<issue>3</issue>): <fpage>589</fpage>–<lpage>602</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1039_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>Zhao</surname> <given-names>L</given-names></string-name> (<year>2014</year>). <article-title>A novel method for detecting abnormal energy data in building energy monitoring system</article-title>. <source>Journal of Energy</source>, <volume>2014</volume>: <fpage>231571</fpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
