Abstract: For estimating the bivariate survival function under random censorship, it is commonly believed that the Dabrowska estimator is among the best, while the Volterra estimator is far from computationally efficient. As we will see, the Volterra estimator is a natural extension of the Kaplan-Meier estimator to the bivariate setting. We believe that the computational ‘inefficiency’ of the Volterra estimator is largely due to the formidable computational complexity of the traditional recursion method. In this paper, we show by numerical study as well as theoretical analysis that the Volterra estimator, once computed by the dynamic programming technique, is more computationally efficient than the Dabrowska estimator. The Volterra estimator with dynamic programming is therefore quite recommendable in applications, owing to its significant computational advantages.
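To make the dynamic programming point concrete, here is a minimal Python sketch of the tabulation idea. The `step` callback is a hypothetical stand-in for the actual Volterra recursion (which couples each grid value to its three lower-left neighbours); only the O(nm) bottom-up fill pattern, as opposed to naive top-down recursion, is the point here.

```python
import numpy as np

def fill_grid_dp(n, m, step):
    """Tabulate a bivariate recursion over an n x m grid in O(n*m) time.

    Naive top-down recursion without memoization revisits the same
    (i, j) subproblems exponentially often; bottom-up tabulation
    visits each cell exactly once.
    """
    S = np.ones((n + 1, m + 1))          # boundary: S(s, 0) = S(0, t) = 1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # `step` stands in for the Volterra update; NOT the actual formula
            S[i, j] = step(S[i - 1, j], S[i, j - 1], S[i - 1, j - 1], i, j)
    return S

# Toy multiplicative update (hypothetical, for illustration only):
toy = fill_grid_dp(5, 5, lambda a, b, c, i, j: a * b / c if c > 0 else 0.0)
print(toy[5, 5])
```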
Abstract: The generalized exponentiated exponential Lindley distribution is a novel three-parameter distribution due to Hussain et al. (2017). They studied its properties, including estimation issues, and illustrated applications to four datasets. Here, we show that several known distributions, including some having only two parameters, can provide better fits. We also correct errors in the derivatives of the likelihood function.
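As an illustration of the kind of comparison made here, the following hedged sketch fits several common two-parameter distributions by maximum likelihood and ranks them by AIC; the sample is simulated, not one of the four datasets of Hussain et al. (2017).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.gamma(shape=2.0, scale=1.5, size=200)   # placeholder sample

candidates = {
    "gamma": stats.gamma,
    "weibull": stats.weibull_min,
    "lognormal": stats.lognorm,
}
for name, dist in candidates.items():
    params = dist.fit(data, floc=0)                # MLE with location fixed at 0
    loglik = np.sum(dist.logpdf(data, *params))
    k = len(params) - 1                            # free parameters (loc fixed)
    print(f"{name:10s} AIC = {2 * k - 2 * loglik:.2f}")
```

Lower AIC indicates the better trade-off between fit and parameter count, which is how a two-parameter model can outperform a three-parameter one.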
Abstract: We have developed an enhanced spike-and-slab model for variable selection in linear regression models via restricted final prediction error (FPE) criteria, classic examples of which are AIC and BIC. Based on our proposed Bayesian hierarchical model, a Gibbs sampler is developed to sample models. The special structure of the prior enforces a unique mapping between sampling a model and calculating constrained ordinary least squares estimates for that model, which helps to formulate the restricted FPE criteria. Empirical comparisons are made with the lasso, adaptive lasso, and relaxed lasso, followed by a real-life data example.
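The following sketch illustrates only the submodel-scoring step under stated assumptions: for each candidate subset of regressors, an ordinary least squares fit yields an FPE-type criterion value (AIC or BIC). The design and response are simulated placeholders, and the spike-and-slab prior and Gibbs sampler are not reproduced.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 2] + rng.normal(size=n)

def criterion(idx, kind="bic"):
    """OLS fit on columns `idx`; return an FPE-type score (AIC or BIC)."""
    Xs = X[:, idx]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    k = len(idx)
    penalty = 2 * k if kind == "aic" else np.log(n) * k
    return n * np.log(rss / n) + penalty

models = [list(idx) for r in range(1, p + 1)
          for idx in combinations(range(p), r)]
best = min(models, key=criterion)
print("best submodel by BIC:", best)
```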
Abstract: Missing data are a common problem for researchers working with surveys and other types of questionnaires. Often, respondents do not respond to one or more items, making the conduct of statistical analyses, as well as the calculation of scores, difficult. A number of methods have been developed for dealing with missing data, though most of these have focused on continuous variables. It is not clear that these imputation techniques are appropriate for the categorical items that make up surveys. However, methods of imputation specifically designed for categorical data are either limited in the number of variables they can accommodate or have not been fully compared with the continuous-data approaches used with categorical variables. The goal of the current study was to compare the performance of these explicitly categorical imputation approaches with the better-established continuous methods used with categorical item responses. Results of a simulation study based on real data demonstrate that the continuous-based imputation approach and a categorical method based on stochastic regression appear to perform well in terms of creating data that match the complete datasets with respect to logistic regression results.
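As a concrete illustration, the sketch below implements stochastic regression imputation for a single binary item on assumed simulated data: a logistic model is fit on complete cases, and missing responses are drawn as Bernoulli variates rather than rounded predictions, preserving response variability.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=(n, 3))                          # fully observed items
eta = X @ np.array([1.0, -0.5, 0.3])
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-eta))).astype(int)
miss = rng.uniform(size=n) < 0.2                     # 20% missing at random

model = LogisticRegression().fit(X[~miss], y[~miss]) # fit on complete cases
p_hat = model.predict_proba(X[miss])[:, 1]
y_imp = y.copy()
y_imp[miss] = rng.binomial(1, p_hat)                 # stochastic draw, not rounding
print("imputed", miss.sum(), "values")
```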
Abstract: We propose two classes of nonparametric point estimators of θ = P(X < Y) in the case where (X, Y) are paired, possibly dependent, absolutely continuous random variables. The proposed estimators are based on nonparametric estimators of the joint density of (X, Y) and the distribution function of Z = Y − X. We explore the use of several density and distribution function estimators and characterise the convergence of the resulting estimators of θ. We consider the use of bootstrap methods to obtain confidence intervals. The performance of these estimators is illustrated using simulated and real data. These examples show that not accounting for pairing and dependence may lead to erroneous conclusions about the relationship between X and Y.
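A minimal sketch of one such estimator, on assumed simulated paired data: estimate the density of Z = Y − X with a kernel density estimator and take its mass above zero as an estimate of θ = P(X < Y) = P(Z > 0).

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n) + 0.3               # dependent pairs
z = y - x                                            # pairing enters through Z

kde = gaussian_kde(z)
theta_kde = kde.integrate_box_1d(0.0, np.inf)        # P(Z > 0) under the KDE
theta_emp = np.mean(z > 0)                           # empirical counterpart
print(f"theta (KDE) = {theta_kde:.3f}, theta (empirical) = {theta_emp:.3f}")
```

Working with Z directly is what accounts for the dependence between X and Y; treating the two samples as independent would estimate a different quantity.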
Abstract: In this paper, we use the generalized influence function and generalized Cook distance to measure the local influence of minor perturbations on the modified ridge regression estimator in a ridge-type linear regression model. The diagnostics under perturbation of the constant variance and of individual explanatory variables are obtained when multicollinearity is present among the regressors. We also propose a statistic that reveals influential cases for Mallows' method, which is used to choose the biasing parameter of the modified ridge regression estimator. Two real data sets are used to illustrate our methodologies.
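The sketch below illustrates the general case-deletion idea with a Cook-style distance for a ridge estimator; it is an assumed simplification for illustration, not the paper's modified ridge estimator or its exact generalized diagnostic.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 50, 4, 1.0                                  # k is the biasing parameter
X = rng.normal(size=(n, p))
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=n)         # induce multicollinearity
y = X @ np.array([1.0, 0.5, 0.0, 1.0]) + rng.normal(size=n)

def ridge(Xm, ym):
    """Ridge estimator (X'X + kI)^{-1} X'y."""
    return np.linalg.solve(Xm.T @ Xm + k * np.eye(p), Xm.T @ ym)

beta = ridge(X, y)
s2 = np.sum((y - X @ beta) ** 2) / (n - p)
M = X.T @ X + k * np.eye(p)
D = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    d = beta - ridge(X[keep], y[keep])                # leave-one-out shift
    D[i] = d @ M @ d / (p * s2)                       # Cook-style distance
print("most influential case:", np.argmax(D))
```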
Abstract: In this paper, the geometric process model is used for analyzing constant-stress accelerated life testing. The generalized half logistic lifetime distribution is considered under progressive type-II censoring. Statistical inference is developed on the basis of the maximum likelihood approach for estimating the unknown parameters and obtaining both asymptotic and bootstrap confidence intervals. In addition, predictive values of the reliability function under usual conditions are found. Moreover, a method for finding the optimal value of the ratio of the geometric process is presented. Finally, a simulation study is presented to illustrate the proposed procedures and to evaluate the performance of the geometric process model.
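The following sketch shows only the parametric bootstrap confidence-interval step, using a simple exponential lifetime as a stand-in for the generalized half logistic model and ignoring the progressive censoring scheme and the geometric process structure.

```python
import numpy as np

rng = np.random.default_rng(5)
t = rng.exponential(scale=2.0, size=40)               # placeholder lifetimes
lam_hat = 1.0 / t.mean()                              # exponential rate MLE

B = 2000
boot = np.empty(B)
for b in range(B):
    # parametric resample from the fitted model, then re-estimate
    tb = rng.exponential(scale=1.0 / lam_hat, size=t.size)
    boot[b] = 1.0 / tb.mean()
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"lambda MLE = {lam_hat:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```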
Abstract: The creation of data sets using observational methods for the lag-sequential study of behavior requires selection of a recording time unit. This is an important issue, because standard methods such as momentary sampling and partial-interval sampling consistently underestimate the frequency of some behaviors. This leads to inaccurate estimation of both unconditional and conditional probabilities of the different behaviors, the basic descriptive and analytic tools of sequential analysis methodology. The purpose of this paper is to investigate the creation of data sets usable for sequential analysis. We show that such data vary with the time resolution and that inappropriate choices lead to biased estimates of transition probabilities.
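The sketch below makes the resolution effect concrete on assumed simulated data: the same two-state behavior stream, recoded at a coarser time unit, produces a visibly different transition matrix.

```python
import numpy as np

rng = np.random.default_rng(6)
n_states, T = 2, 10_000
stream = np.zeros(T, dtype=int)
for t in range(1, T):                                 # bursty two-state process
    stay = 0.95 if stream[t - 1] == 0 else 0.80
    stream[t] = stream[t - 1] if rng.uniform() < stay else 1 - stream[t - 1]

def transition_matrix(seq):
    """Row-normalized counts of one-step transitions."""
    P = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        P[a, b] += 1
    return P / P.sum(axis=1, keepdims=True)

print("fine resolution:\n", transition_matrix(stream))
print("coarse (every 10th unit):\n", transition_matrix(stream[::10]))
```

The coarse recoding skips over short bouts entirely, which is exactly the underestimation-of-frequency problem described above.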
Abstract: We develop a likelihood ratio test statistic, based on the beta-binomial distribution, for comparing a single treated group with dichotomous data to dual control groups. This statistic is useful in cases where there is overdispersion or extra-binomial variation. We apply the statistic to data from a two-year rodent carcinogenicity study with dual control groups. The test statistic we developed is similar to others that have been developed for incorporating historical control groups in rodent carcinogenicity experiments. However, for the small-sample case we considered, the large-sample theory used by the other test statistics did not apply. We determined the critical values of this statistic by enumerating its distribution. A small Monte Carlo study shows that the new test statistic controls the significance level much better than Fisher's exact test when there is overdispersion, and that it has adequate power.
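A hedged sketch of the likelihood-ratio construction, assuming illustrative cage-level tumour counts: fit a beta-binomial by maximum likelihood under the pooled null and with separate treated-group parameters, then form the LRT statistic. The paper's exact parameterization and the enumeration used for small-sample critical values are not reproduced here.

```python
import numpy as np
from scipy.stats import betabinom
from scipy.optimize import minimize

n = 10                                                # animals per cage
x_ctrl = np.array([0, 1, 0, 2, 1, 3, 0, 1])           # tumour counts, dual controls
x_trt = np.array([2, 4, 1, 5])                        # tumour counts, treated

def negll(log_ab, x):
    a, b = np.exp(log_ab)                             # keep a, b positive
    return -np.sum(betabinom.logpmf(x, n, a, b))

def fitted_ll(x):
    """Maximized beta-binomial log-likelihood for counts x."""
    res = minimize(negll, x0=np.zeros(2), args=(x,), method="Nelder-Mead")
    return -res.fun

ll_null = fitted_ll(np.concatenate([x_ctrl, x_trt]))  # one beta-binomial for all
ll_alt = fitted_ll(x_ctrl) + fitted_ll(x_trt)         # separate treated parameters
print("LRT statistic:", 2 * (ll_alt - ll_null))
```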