In this paper, parameter estimation for the power Lomax distribution is studied using several methods: maximum likelihood, maximum product of spacings, ordinary least squares, weighted least squares, Cramér–von Mises, and Bayesian estimation by Markov chain Monte Carlo (MCMC). Robust estimation of the stress-strength model for the power Lomax distribution is discussed. We propose the method of maximum product of spacings for reliable estimation of the stress-strength model as an alternative to the maximum likelihood and Bayesian estimation methods. A numerical study using real data and Monte Carlo simulation is performed to compare the methods.
Abstract: Constrained general linear models (CGLMs) have wide applications in practice. As in other data analyses, the identification of influential observations that may be potential outliers is an important step in fitting CGLMs. We develop a local influence approach for detecting influential observations in CGLMs. The procedure makes use of the normal curvature and the direction achieving the maximum curvature to assess the local influence of minor perturbations of CGLMs. An illustrative example with a real data set is also reported.
Abstract: We group approaches to modeling correlated binary data according to data recorded cross-sectionally as opposed to data recorded longitudinally; according to models that are population-averaged as opposed to subject-specific; and according to data with time-dependent covariates as opposed to time-independent covariates. Standard logistic regression models are appropriate for cross-sectional data. However, for longitudinal data, methods such as generalized estimating equations (GEE) and generalized method of moments (GMM) are commonly used to fit population-averaged models, while random-effects models such as generalized linear mixed models (GLMM) are used to fit subject-specific models. Some of these methods account for time-dependence in covariates while others do not. This paper addresses these approaches with an illustration using a Medicare dataset as it relates to rehospitalization. In particular, we compared results from standard logistic models, GEE models, GMM models, and random-effects models by analyzing a binary outcome for four successive hospitalizations. We found that these procedures differ in how they address the correlation among responses and the feedback from response to covariate. We found marginal GMM logistic regression models to be more appropriate than GEE models when covariates are classified as time-dependent. We also found conditional random-intercept models with time-dependent covariates decomposed into components to be more appropriate than ordinary random-effects models when time-dependent covariates are present. We used the SAS procedures GLIMMIX, NLMIXED, IML, GENMOD, and LOGISTIC to analyze the illustrative dataset, as well as unique programs written in the R language.
Abstract: A powerful methodology for exploring relationships among items, association rules analysis can be used to capture a set of rules from any given dataset. It is little known, however, that a single dataset can be represented by more than one set of rules, i.e., by equivalent models. In fact, most studies on the goodness of a model can be misleading because they assume the model is unique. This is a phenomenon that the literature has yet to explore. In our study, we demonstrate that equivalent models exist for any dataset and propose a method for converting any given model into its dominant model, which we recommend as the benchmark model. Further, we explain how the phenomenon of equivalent models affects decision tree analysis and statistical model selection. We show that the decision rules from decision tree analysis can always be simplified by reducing them to the dominant model. Simulated and real datasets are used for illustration.
Abstract: In recent years, many modifications of the Weibull distribution have been proposed. Some of these modifications have a large number of parameters, and so their real benefits over simpler modifications are questionable. Here, we use two data sets with a modified unimodal (unimodal followed by increasing) hazard function to compare the exponentiated Weibull and generalized modified Weibull distributions. We find no evidence that the generalized modified Weibull distribution can provide a better fit than the exponentiated Weibull distribution for data sets exhibiting the modified unimodal hazard function. On a related issue, we consider Carrasco et al. (2008), a widely cited paper that proposes the generalized modified Weibull distribution and illustrates it with two real data applications. We point out that some of the results in both real data applications in Carrasco et al. (2008) are incorrect.
An exponentiated Weibull-geometric distribution is defined and studied. A new count data regression model, based on the exponentiated Weibull-geometric distribution, is also defined. The regression model can be applied to fit underdispersed or overdispersed count data. The exponentiated Weibull-geometric regression model is fitted to two numerical data sets. The new model provided a better fit than its competitors.
Abstract: Bivariate data analysis plays a key role in several areas where the variables of interest are obtained in a paired form, leading to the consideration of possible association measures between them. In most cases, it is common to use known statistical measures such as the Pearson correlation and Kendall's and Spearman's coefficients. However, these statistical measures may not represent the real correlation or structure of dependence between the variables. Fisher and Switzer (1985) proposed a rank-based graphical tool, the so-called chi-plot, which, in conjunction with its Monte Carlo based confidence interval, can help detect the presence of association in a random sample from a continuous bivariate distribution. In this article we construct the asymptotic confidence interval for the chi-plot. Via a Monte Carlo simulation study we find that the coverage probabilities of the asymptotic and the Monte Carlo based confidence intervals are similar. An immediate advantage of the asymptotic confidence interval over the Monte Carlo based one is that it is computationally less expensive and allows any confidence level to be chosen. Moreover, it can be implemented straightforwardly in existing statistical software. The chi-plot approach is illustrated on data on average intelligence and atheism rates across nations.
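As a rough illustration of the rank-based construction mentioned above, the sketch below computes chi-plot coordinates using the usual Fisher and Switzer definitions: for each observation, the empirical marginal proportions F and G and the joint proportion H are computed over the other n - 1 points, giving chi = (H - FG) / sqrt(F(1 - F)G(1 - G)) and lambda = 4 sign((F - 1/2)(G - 1/2)) max((F - 1/2)^2, (G - 1/2)^2). The function name `chi_plot_points` is hypothetical, and the asymptotic confidence interval constructed in the article is not reproduced here.

```python
import math


def chi_plot_points(x, y):
    """Return (lambda_i, chi_i) pairs for paired samples x and y.

    Points whose chi value is undefined (F or G equal to 0 or 1)
    are skipped. Illustrative sketch only.
    """
    n = len(x)
    points = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Empirical proportions computed over the remaining n - 1 points
        F = sum(x[j] <= x[i] for j in others) / (n - 1)
        G = sum(y[j] <= y[i] for j in others) / (n - 1)
        H = sum(x[j] <= x[i] and y[j] <= y[i] for j in others) / (n - 1)
        denom = F * (1 - F) * G * (1 - G)
        if denom <= 0:
            continue  # chi is undefined at the extremes
        chi = (H - F * G) / math.sqrt(denom)
        prod = (F - 0.5) * (G - 0.5)
        sign = (prod > 0) - (prod < 0)
        lam = 4 * sign * max((F - 0.5) ** 2, (G - 0.5) ** 2)
        points.append((lam, chi))
    return points


# Toy data (not from the article): a perfectly increasing relationship
x = list(range(1, 11))
points = chi_plot_points(x, x)
```

For independent variables the chi values scatter around zero, while monotone dependence pushes them toward +1 or -1, which is what the confidence band in a chi-plot is meant to help judge.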
This study compares various quantitative models for forecasting monthly foreign tourist arrivals (FTAs) to India. The models considered include the vector error correction (VEC) model, the Naive I and Naive II models, the seasonal autoregressive integrated moving average (SARIMA) model, and Grey models. A model that combines the single forecast values using the simple average (SA) method has also been applied. The forecasting performance of these models has been compared under the mean absolute percentage error (MAPE) and U-statistic (Ustat) criteria. Empirical findings suggest that the combination model gives better forecasts of FTAs to India than the individual time series models considered here.
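The simple-average combination and the MAPE criterion can be sketched as follows. This is a minimal illustration only: the function names `mape` and `simple_average` and all numbers are hypothetical and are not taken from the study, and the U-statistic is not shown.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def simple_average(*forecasts):
    """Combine several forecast series by averaging them point by point."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


# Made-up values for illustration (not data from the study)
actual = [100.0, 120.0, 150.0]
model_a = [90.0, 126.0, 165.0]
model_b = [108.0, 114.0, 141.0]

combined = simple_average(model_a, model_b)
scores = {
    "model_a": mape(actual, model_a),
    "model_b": mape(actual, model_b),
    "combined": mape(actual, combined),
}
```

In this toy case the two models err in opposite directions, so averaging cancels much of the error; a combination is not guaranteed to beat every individual model in general.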
Abstract: Simulation studies are important statistical tools used to investigate the performance, properties and adequacy of statistical models. The simulation of right-censored time-to-event data involves the generation of two independent survival distributions, where the first distribution represents the uncensored survival times and the second distribution represents the censoring mechanism. In this brief report we discuss how the percentage of censored data can be fixed in advance. The described method was used to generate data from a Weibull distribution, but it can be adapted to any other lifetime distribution. We further present an R function for generating random samples using the proposed approach.
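One way to fix the censoring percentage in advance can be sketched as follows. This is not the R function from the report: it assumes, purely for illustration, that the censoring times follow a Weibull distribution with the same shape k as the event times. Under that assumption, with event scale a and censoring scale b, P(C < T) = b^(-k) / (a^(-k) + b^(-k)), so a target censoring proportion p gives b = a ((1 - p) / p)^(1/k). The function names are hypothetical.

```python
import random


def censoring_scale(scale, shape, p_cens):
    """Censoring scale b that gives P(C < T) = p_cens when T and C
    are independent Weibull variables sharing the same shape."""
    return scale * ((1.0 - p_cens) / p_cens) ** (1.0 / shape)


def simulate_censored_weibull(n, shape, scale, p_cens, seed=None):
    """Return a list of (observed_time, event_indicator) pairs,
    where event_indicator is 1 for an event and 0 for censoring."""
    rng = random.Random(seed)
    c_scale = censoring_scale(scale, shape, p_cens)
    data = []
    for _ in range(n):
        # random.weibullvariate takes (scale, shape) in that order
        t = rng.weibullvariate(scale, shape)
        c = rng.weibullvariate(c_scale, shape)
        data.append((min(t, c), 1 if t <= c else 0))
    return data


sample = simulate_censored_weibull(1000, shape=1.5, scale=2.0, p_cens=0.3, seed=1)
observed_censoring = sum(1 for _, event in sample if event == 0) / len(sample)
```

The realized censoring proportion fluctuates around the target from sample to sample; other censoring distributions generally require solving for the censoring parameter numerically.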
Analysing seasonality in count time series is an essential application of statistics to predict phenomena in different fields such as economics, agriculture, healthcare, environment, and climate change. However, information on the performance of the relevant statistical models is scarce in the existing literature. This study provides the Yule-Walker (Y-W), Conditional Least Squares (CLS), and Maximum Likelihood Estimation (MLE) methods for the first-order non-negative integer-valued autoregressive, INAR(1), process with Poisson innovations that have different monthly means. The performance of Y-W, CLS, and MLE is assessed by the Monte Carlo simulation method. The performance of this model is compared with another seasonal INAR(1) model by reproducing the monthly number of rainy days in the Blackwater River watershed located in coastal Virginia. Two forecast-coherent methods, based on the mode and on the probability function, are applied to make predictions. The models’ performances are assessed using the Root Mean Square Error and Index of Agreement criteria. The results reveal similar performance of Y-W, CLS, and MLE in estimating the parameters for data sets with larger sample sizes and values of α close to the unit root. Moreover, the results indicate that INAR(1) with different monthly Poisson innovations is more appropriate for modelling and predicting seasonal count time series.
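A minimal sketch of such a process, under stated assumptions: X_t = α ∘ X_{t-1} + ε_t, where ∘ denotes binomial thinning and ε_t is Poisson with a mean that depends on the month. The function names are hypothetical, and the lag-1 autocorrelation used below is the familiar Yule-Walker estimate of α for a stationary INAR(1); it is shown only for the constant-mean case and is not the seasonal estimator developed in the study.

```python
import math
import random


def poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def thin(rng, alpha, x):
    """Binomial thinning: each of the x counts survives with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_inar1(n, alpha, monthly_means, seed=None):
    """Simulate n steps of an INAR(1) process whose Poisson innovation
    mean cycles through monthly_means."""
    rng = random.Random(seed)
    x = poisson(rng, monthly_means[0])  # arbitrary starting value
    series = []
    for t in range(n):
        lam = monthly_means[t % len(monthly_means)]
        x = thin(rng, alpha, x) + poisson(rng, lam)
        series.append(x)
    return series


def yule_walker_alpha(series):
    """Lag-1 sample autocorrelation (Y-W estimate of alpha, stationary case)."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


# Hypothetical monthly innovation means, for illustration only
seasonal_means = [1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0, 1.0]
seasonal_series = simulate_inar1(240, alpha=0.5, monthly_means=seasonal_means, seed=7)
```

With differing monthly means, the raw lag-1 autocorrelation also picks up the seasonal pattern, which is one reason a seasonal formulation of the estimators is needed.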