Abstract: Despite its unrealistic feature-independence assumption, the naive Bayes classifier offers a simple way to assign an observation to a class given the observed features, and it competes well with more sophisticated classifiers under the zero-one loss function. However, it has been shown that naive Bayes can perform poorly in both estimation and classification when the features are correlated. Researchers have therefore developed many approaches that free naive Bayes from this fundamental assumption, which is rarely satisfied in practice. In this paper, we propose a new classifier that is also free of the independence assumption: it evaluates the dependence among features through pair copulas constructed via a graphical model called a D-vine tree. This tree structure decomposes the multivariate dependence into many bivariate dependencies, making it possible to evaluate the dependence of features easily and efficiently, even for data with high dimension and large sample size. We further extend the proposed method to features with discrete-valued entries. Experimental studies show that the proposed method performs well in both the continuous and discrete cases.
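To make the contrast between the two factorizations concrete, the following is a minimal sketch for three features. The notation is an assumption introduced here for illustration and is not taken from the paper: \pi_y is the class prior, f_j and F_j are the class-conditional marginal density and distribution function of feature j, and c denotes a pair-copula density. The variable order 1-2-3 is likewise assumed.

```latex
% Naive Bayes: features are treated as independent within each class y.
\begin{align}
\hat{y} &= \arg\max_{y} \; \pi_y \prod_{j=1}^{3} f_j(x_j \mid y)
\end{align}

% D-vine pair-copula decomposition of a joint density for d = 3
% (order 1-2-3); in a classifier it would be applied within each class.
\begin{align}
f(x_1, x_2, x_3) &= f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
&\quad \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\,
         c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
&\quad \cdot c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
\end{align}
```

If every pair-copula density equals 1, the second expression reduces to the product of marginals, so naive Bayes appears as the special case with no dependence among features.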
An exponentiated Weibull-geometric distribution is defined and studied. A new count data regression model, based on the exponentiated Weibull-geometric distribution, is also defined. The regression model can be applied to fit underdispersed or overdispersed count data. The exponentiated Weibull-geometric regression model is fitted to two numerical data sets. The new model provides a better fit than its competitors.
Abstract: In this study, we compare various block bootstrap methods in terms of parameter estimation and the bias and mean squared error (MSE) of the bootstrap estimators. The comparison is based on four real-world examples and an extensive simulation study with various sample sizes, parameters and block lengths. Our results reveal that the ordered and sufficient ordered non-overlapping block bootstrap methods proposed by Beyaztas et al. (2016) provide better results in terms of parameter estimation and its MSE than conventional methods. In addition, the sufficient non-overlapping block bootstrap method and its ordered version have the smallest MSE for the sample mean among all the methods compared.
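As a point of reference for the comparison described above, here is a minimal sketch of the conventional non-overlapping block bootstrap applied to the sample mean. It does not implement the ordered or sufficient variants of Beyaztas et al. (2016). The function names, the default number of replicates, and the choice to measure bias and MSE against the sample mean of the observed series are assumptions made for illustration.

```python
import random
import statistics


def nbb_resample(series, block_length, rng):
    """Draw one non-overlapping block bootstrap resample (hypothetical helper)."""
    n_blocks = len(series) // block_length
    # Split the series into consecutive, non-overlapping blocks;
    # observations left over after the last full block are dropped.
    blocks = [
        series[i * block_length:(i + 1) * block_length]
        for i in range(n_blocks)
    ]
    # Sample blocks with replacement and concatenate them.
    resample = []
    for _ in range(n_blocks):
        resample.extend(rng.choice(blocks))
    return resample


def nbb_mean_summary(series, block_length, n_boot=500, seed=0):
    """Bootstrap bias and MSE of the sample mean (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    usable = series[:(len(series) // block_length) * block_length]
    sample_mean = statistics.fmean(usable)
    boot_means = [
        statistics.fmean(nbb_resample(series, block_length, rng))
        for _ in range(n_boot)
    ]
    # Bias and MSE are taken relative to the observed sample mean (an assumption).
    bias = statistics.fmean(boot_means) - sample_mean
    mse = statistics.fmean([(m - sample_mean) ** 2 for m in boot_means])
    return {"mean": sample_mean, "bias": bias, "mse": mse}
```

Calling `nbb_mean_summary(series, block_length=5)` on a list of floats returns the observed mean together with the bootstrap bias and MSE. Repeating the call over several block lengths gives a rough picture of how sensitive the estimates are to that choice.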
The Lindley distribution has been generalized by many authors in recent years. However, all of the generalizations known so far have restricted tail behaviors. Here, we introduce the most flexible generalization of the Lindley distribution, with its tails controlled by two independent parameters. Various mathematical properties of the generalization are derived, along with maximum likelihood estimators of its parameters. Fisher’s information matrix and asymptotic confidence intervals for the parameters are given. Finally, a real data application shows that the proposed generalization performs better than all known ones.
In this work, we introduce a new distribution for modeling extreme values. Some important mathematical properties of the new model are derived. We assess the performance of the maximum likelihood method in terms of bias and mean squared error by means of a simulation study. The new model performs better than several important competing models in modeling the repair times data and the breaking stress data.