An exponentiated Weibull-geometric distribution is defined and studied. A new count data regression model based on this distribution is also defined. The regression model can be fitted to either underdispersed or overdispersed count data. The exponentiated Weibull-geometric regression model is fitted to two numerical data sets, where it provides a better fit than its competitors.
Families of distributions are commonly used to model insurance claims data, which require flexible distributional forms. Assessing the goodness-of-fit of the hypothesized model (the specification problem) can nevertheless be challenging because of the complexity of the likelihood function of the family involved. Previous work shows that these specification problems can be attacked by means of semi-parametric tests based on generalized method of moments (GMM) estimators. While the approach can be applied directly to both discrete and continuous families of distributions, this paper focuses on developing a testing strategy for discrete families. Both the local power analysis and the approximate slope method demonstrate the excellent performance of these tests. The finite-sample performance of the tests, based on both asymptotic and bootstrap critical values, is also discussed and compared with established methods that require complete specification of the likelihood function.
When releasing data to the public, a vital concern is the risk of exposing personal information of the individuals who have contributed to the data set. Many mechanisms have been proposed to protect individual privacy, but less attention has been paid to conducting valid inference in practice on the altered, privacy-protected data sets. For frequency tables, privacy-protecting perturbations often lead to negative cell counts, and releasing such tables can undermine users’ confidence in the usefulness of the data. This paper focuses on releasing one-way frequency tables. We recommend an optimal mechanism that satisfies ϵ-differential privacy (DP) without producing negative cell counts. The procedure is optimal in the sense that the expected utility is maximized under a given privacy constraint. Valid inference procedures for testing goodness-of-fit are also developed for the DP-protected data. In particular, we propose a de-biased test statistic for the optimal procedure and derive its asymptotic distribution. We also introduce testing procedures for the commonly used Laplace and Gaussian mechanisms, which provide a good finite-sample approximation to the null distributions. Moreover, we provide the decay-rate requirements on the privacy regime under which the inference procedures are valid. We further consider common user practices, such as merging related or neighboring cells or integrating statistical information obtained from different data sources, and derive valid testing procedures when these operations occur. Simulation studies show that our inference results hold well even when the sample size is relatively small. We carry out comparisons with the current field standards, including the Laplace and Gaussian mechanisms (both with and without post-processing that replaces negative cell counts with zeros) and the Binomial-Beta McClure-Reiter mechanism. Finally, we apply our method to the National Center for Early Development and Learning’s (NCEDL) multi-state studies data to demonstrate its practical applicability.
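To make the negative-count issue concrete, the following is a minimal sketch of the standard Laplace baseline mentioned above, not the optimal mechanism proposed in the paper. It assumes that neighboring data sets differ by the addition or removal of one record, so that the L1 sensitivity of a one-way frequency table is 1; the function names, the example counts, and the choice ϵ = 0.5 are hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows a
    # standard Laplace distribution; multiply by the desired scale.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def laplace_mechanism(counts, epsilon, sensitivity=1.0, seed=None):
    # Assumption: sensitivity = 1 (add/remove-one-record neighbors).
    # Each cell receives independent Laplace(sensitivity / epsilon) noise.
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def clip_negative(noisy_counts):
    # Post-processing step: replace negative cell counts with zeros.
    return [max(0.0, c) for c in noisy_counts]


# Hypothetical one-way frequency table with several small cells.
counts = [120, 45, 3, 0, 1]
noisy = laplace_mechanism(counts, epsilon=0.5, seed=0)
clipped = clip_negative(noisy)
```

With noise of scale 1/ϵ, cells whose true counts are near zero can easily become negative. Clipping removes the negative values but pushes the released counts upward on average in those cells, which is one reason dedicated inference procedures are needed for privacy-protected tables.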