An exponentiated Weibull-geometric distribution is defined and studied. A new count data regression model, based on the exponentiated Weibull-geometric distribution, is also defined. The regression model can be applied to fit underdispersed or overdispersed count data. The exponentiated Weibull-geometric regression model is fitted to two numerical data sets. The new model provided a better fit than its competitors.
Friedman’s test is a rank-based procedure that can be used to test for differences among t treatment distributions in a randomized complete block design. It is well known that the test has reasonably good power under location-shift alternatives to the null hypothesis of no difference in the t treatment distributions. However, the power of Friedman’s test can be poor when the alternative hypothesis consists of a non-location difference in treatment distributions. We develop the properties of an alternative rank-based test that has greater power than Friedman’s test in a variety of such circumstances. The test is based on the joint distribution of the t! possible permutations of the treatment ranks within a block (assuming no ties). We show when our proposed test will have greater power than Friedman’s test, and provide results from extensive numerical work comparing the power of the two tests under various configurations for the underlying treatment distributions.
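As a rough illustration of the quantities mentioned in this abstract, the following Python sketch computes the classical Friedman statistic from within-block ranks and tabulates how often each of the t! rank permutations occurs across blocks. It is not the authors' proposed test, whose statistic is not given here. The function names (`within_block_ranks`, `friedman_statistic`, `permutation_counts`) and the small data set are hypothetical, and the code assumes no ties within a block.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def within_block_ranks(block):
    """Rank one block's observations from 1 (smallest) to t, assuming no ties."""
    order = sorted(range(len(block)), key=lambda j: block[j])
    ranks = [0] * len(block)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return tuple(ranks)


def friedman_statistic(data):
    """Friedman's chi-square statistic for b blocks (rows) and t treatments (columns)."""
    b, t = len(data), len(data[0])
    rank_rows = [within_block_ranks(row) for row in data]
    # Sum of ranks received by each treatment across all blocks.
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(t)]
    return 12.0 / (b * t * (t + 1)) * sum(s * s for s in rank_sums) - 3.0 * b * (t + 1)


def permutation_counts(data):
    """Count how often each of the t! possible rank permutations occurs across blocks."""
    t = len(data[0])
    counts = Counter(within_block_ranks(row) for row in data)
    return {perm: counts.get(perm, 0) for perm in permutations(range(1, t + 1))}


# Hypothetical data: 4 blocks (rows), 3 treatments (columns).
data = [
    [1.2, 2.5, 3.1],
    [0.8, 1.9, 2.7],
    [1.5, 1.1, 3.0],
    [2.0, 2.4, 2.2],
]
q = friedman_statistic(data)
counts = permutation_counts(data)
print(q)
print(counts)
```

Friedman's statistic depends on the data only through the treatment rank sums, whereas the table of permutation counts retains the full within-block rank pattern.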
Families of distributions are commonly used to model insurance claims data that require flexible distributional forms, but assessing the goodness-of-fit of the hypothesized model can be a challenge because of the complexity of the likelihood function of the family involved. Previous work shows that this specification problem can be addressed by means of semi-parametric tests based on generalized method of moments (GMM) estimators. While the approach can be applied directly to both discrete and continuous families of distributions, the paper focuses on developing a testing strategy for discrete families. Both the local power analysis and the approximate slope method demonstrate the excellent performance of these tests. The finite-sample performance of the tests, based on both asymptotic and bootstrap critical values, is also discussed and compared with that of established methods that require the complete specification of likelihood functions.
The complexity of energy infrastructure at large institutions increasingly calls for data-driven monitoring of energy usage. This article presents a hybrid monitoring algorithm for detecting consumption surges using statistical hypothesis testing. The algorithm leverages the posterior distribution and its information about uncertainty to introduce randomness in the parameter estimates, while retaining the frequentist testing framework. This hybrid approach is designed to be asymptotically equivalent to the Neyman-Pearson test. We show via extensive simulation studies that the hybrid approach controls the type I error rate even with finite sample sizes, whereas the naive plug-in method tends to exceed the specified level, resulting in overpowered tests. The proposed method is applied to natural gas usage data at the University of Connecticut.
Large-scale genomics studies give researchers access to datasets of extensive detail and unprecedented scope, encompassing not only genes but also more experimental functional units, including non-coding microRNAs (miRNAs). To analyze these high-fidelity data while remaining faithful to the underlying biology, statistical methods are needed that reflect the full range of understanding in contemporary molecular biology while remaining flexible enough to handle a wide range of data and complex phenomena. Leveraging multiple omics datasets, miRNA-gene targets, and signaling pathway topology, we present an integrative linear model to analyze signaling pathways. Specifically, we use a mixed linear model to characterize tumor and healthy tissue, and perform statistical significance testing to identify pathway disturbances. In this paper, a pan-cancer analysis is performed for a wide range of signaling pathways. We discuss specific findings from this analysis, as well as a publicly available interactive data visualization that contains the full range of our analytic findings.