Abstract: Incomplete data are a common phenomenon in research that adopts a longitudinal design. When incomplete observations are present in a longitudinal data structure, ignoring them can bias statistical inference and interpretation. We adopt the disposition model and extend it to the analysis of longitudinal binary outcomes in the presence of monotone incomplete data. The response variable is modeled using a conditional logistic regression model. The nonresponse mechanism is assumed ignorable and is developed as a combination of a Markov transition model and a logistic regression model. Maximum likelihood is used for parameter estimation. An application of our approach to rheumatoid arthritis clinical trials is presented.
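As a rough illustration of the conditional modeling idea, the sketch below fits a first-order transition (Markov) logistic model for a longitudinal binary outcome, regressing the current response on the lagged response and a baseline covariate. The simulated data and variable names are illustrative assumptions, not the authors' disposition model or their nonresponse component.

```python
# Minimal sketch: first-order transition logistic model for a longitudinal
# binary outcome. Data are simulated; this is not the paper's disposition model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, T = 200, 5
x = rng.normal(size=n)                      # baseline covariate
y = np.empty((n, T))
y[:, 0] = rng.binomial(1, 0.5, size=n)
for t in range(1, T):                       # response depends on previous state
    eta = -0.5 + 1.2 * y[:, t - 1] + 0.8 * x
    y[:, t] = rng.binomial(1, 1 / (1 + np.exp(-eta)))

# Stack person-time records (t >= 1) and fit a logit of y_t on (y_{t-1}, x).
prev = y[:, :-1].ravel()
curr = y[:, 1:].ravel()
X = sm.add_constant(np.column_stack([prev, np.repeat(x, T - 1)]))
fit = sm.Logit(curr, X).fit(disp=0)
print(fit.params)                           # intercept, lag effect, covariate
```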
Abstract: For estimating the bivariate survival function under random censorship, it is commonly believed that the Dabrowska estimator is among the best, while the Volterra estimator is far from computationally efficient. As we will see, the Volterra estimator is a natural extension of the Kaplan-Meier estimator to the bivariate setting. We believe that the computational ‘inefficiency’ of the Volterra estimator is largely due to the formidable computational complexity of the traditional recursion method. In this paper, we show by numerical study as well as theoretical analysis that the Volterra estimator, once computed by the dynamic programming technique, is more computationally efficient than the Dabrowska estimator. The Volterra estimator with dynamic programming is therefore quite recommendable in applications owing to its significant computational advantages.
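The computational point can be seen with a toy two-dimensional recurrence of the same shape as the Volterra recursion, where each cell depends on all cells below and to its left. The recurrence F below is an assumed stand-in, not the actual Volterra estimator; it only contrasts naive recursion with table-filling dynamic programming.

```python
# Toy contrast of naive recursion vs. dynamic programming for a 2-D
# Volterra-type recurrence. F is a stand-in, not the Volterra estimator.
import numpy as np

def f_naive(i, j):
    # Exponential cost: every sub-cell is recomputed many times.
    if i == 0 or j == 0:
        return 1.0
    return 1.0 + 0.01 * sum(f_naive(a, b) for a in range(i) for b in range(j))

def f_dp(m, n):
    # Each cell is computed once by filling the table in order.
    F = np.ones((m + 1, n + 1))
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i, j] = 1.0 + 0.01 * F[:i, :j].sum()
    return F[m, n]

print(f_naive(4, 4), f_dp(4, 4))  # identical values, very different cost
```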
Abstract: The generalized exponentiated exponential Lindley distribution is a novel three-parameter distribution due to Hussain et al. (2017). They studied its properties, including estimation issues, and illustrated applications to four data sets. Here, we show that several known distributions, including some with only two parameters, can provide better fits. We also correct errors in the derivatives of the likelihood function.
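The kind of comparison made here can be sketched as follows: fit several candidate distributions by maximum likelihood and rank them by AIC. The gamma, Weibull, and lognormal candidates are illustrative assumptions; scipy does not ship the generalized exponentiated exponential Lindley distribution itself.

```python
# Minimal sketch: fit candidate distributions by MLE and compare by AIC.
import numpy as np
from scipy import stats

data = stats.gamma.rvs(a=2.0, scale=1.5, size=200, random_state=1)

candidates = {
    "gamma": stats.gamma,
    "weibull": stats.weibull_min,
    "lognormal": stats.lognorm,
}
for name, dist in candidates.items():
    params = dist.fit(data)                   # MLE (includes a location shift)
    ll = dist.logpdf(data, *params).sum()     # maximized log-likelihood
    aic = 2 * len(params) - 2 * ll
    print(f"{name:10s} AIC = {aic:.1f}")
```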
Abstract: We develop an enhanced spike-and-slab model for variable selection in linear regression models via restricted final prediction error (FPE) criteria, classic examples of which are AIC and BIC. Based on our proposed Bayesian hierarchical model, a Gibbs sampler is developed to sample models. The special structure of the prior enforces a unique mapping between sampling a model and calculating constrained ordinary least squares estimates for that model, which helps to formulate the restricted FPE criteria. Empirical comparisons are made with the lasso, adaptive lasso, and relaxed lasso, followed by a real-life data example.
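To show the Gibbs-sampling idea in the spike-and-slab family, here is a minimal stochastic search variable selection (SSVS) sampler with a continuous spike: coefficients are drawn from their Gaussian full conditional, and inclusion indicators from spike-versus-slab Bernoulli odds. The prior settings tau0, tau1, and pi are illustrative assumptions; the paper's restricted-FPE prior is more specialized.

```python
# Minimal SSVS Gibbs sampler sketch, not the paper's restricted-FPE prior.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0, 0, -1.5, 0, 0, 0, 1.0])
y = X @ beta_true + rng.normal(size=n)

tau0, tau1, pi, sigma2 = 0.01, 3.0, 0.5, 1.0    # assumed tuning choices
gamma = np.ones(p, dtype=bool)
keep = np.zeros(p)
XtX, Xty = X.T @ X, X.T @ y

for it in range(2000):
    # beta | gamma: Gaussian full conditional
    D_inv = np.where(gamma, 1 / tau1**2, 1 / tau0**2)
    cov = np.linalg.inv(XtX / sigma2 + np.diag(D_inv))
    beta = rng.multivariate_normal(cov @ Xty / sigma2, cov)
    # gamma_j | beta_j: Bernoulli from spike vs. slab normal densities
    log_slab = -0.5 * beta**2 / tau1**2 - np.log(tau1)
    log_spike = -0.5 * beta**2 / tau0**2 - np.log(tau0)
    prob = 1 / (1 + (1 - pi) / pi * np.exp(log_spike - log_slab))
    gamma = rng.random(p) < prob
    if it >= 500:                               # discard burn-in draws
        keep += gamma

print("posterior inclusion probabilities:", np.round(keep / 1500, 2))
```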
Abstract: Missing data are a common problem for researchers working with surveys and other types of questionnaires. Often, respondents do not respond to one or more items, making statistical analyses, as well as the calculation of scores, difficult. A number of methods have been developed for dealing with missing data, though most of these have focused on continuous variables. It is not clear that such imputation techniques are appropriate for the categorical items that make up surveys. However, methods of imputation designed specifically for categorical data are either limited in the number of variables they can accommodate, or have not been fully compared with the continuous-data approaches used with categorical variables. The goal of the current study was to compare the performance of these explicitly categorical imputation approaches with the better-established continuous method applied to categorical item responses. Results of a simulation study based on real data demonstrate that the continuous imputation approach and a categorical method based on stochastic regression both perform well, producing imputed data whose logistic regression results match those of the complete data sets.
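A minimal sketch of stochastic regression imputation for a binary item follows: fit a logistic regression on the observed cases, then impute each missing response by drawing from the predicted probability rather than rounding it, which preserves item variability. The simulated data and missingness rate are illustrative assumptions.

```python
# Minimal sketch: stochastic regression imputation for a binary survey item.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))                     # other items / covariates
p = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 1.1 * X[:, 1])))
y = rng.binomial(1, p).astype(float)
y[rng.random(n) < 0.2] = np.nan                 # 20% missing at random

obs = ~np.isnan(y)
model = LogisticRegression().fit(X[obs], y[obs].astype(int))
p_miss = model.predict_proba(X[~obs])[:, 1]
y[~obs] = rng.binomial(1, p_miss)               # stochastic draw, not a mode
```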
Abstract: We propose two classes of nonparametric point estimators of θ = P(X < Y) in the case where (X, Y) are paired, possibly dependent, absolutely continuous random variables. The proposed estimators are based on nonparametric estimators of the joint density of (X, Y) and the distribution function of Z = Y − X. We explore the use of several density and distribution function estimators and characterise the convergence of the resulting estimators of θ. We consider the use of bootstrap methods to obtain confidence intervals. The performance of these estimators is illustrated using simulated and real data. These examples show that not accounting for pairing and dependence may lead to erroneous conclusions about the relationship between X and Y.
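One of the two estimator classes can be sketched directly: estimate θ = P(X < Y) = P(Z > 0) from a kernel density estimate of Z = Y − X, which respects the pairing, and bootstrap the pairs jointly for a percentile interval. The data-generating model below is an illustrative assumption.

```python
# Minimal sketch: estimate theta = P(X < Y) via a KDE of Z = Y - X.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
n = 200
x = rng.normal(0.0, 1.0, n)
y = 0.7 * x + rng.normal(0.3, 1.0, n)           # paired, dependent
z = y - x

theta_hat = gaussian_kde(z).integrate_box_1d(0.0, np.inf)   # P(Z > 0)

boots = []
for _ in range(500):
    idx = rng.integers(0, n, n)                 # resample pairs jointly
    boots.append(gaussian_kde(z[idx]).integrate_box_1d(0.0, np.inf))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```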
Abstract: In this paper, we use the generalized influence function and the generalized Cook distance to measure the local influence of minor perturbations on the modified ridge regression estimator in the ridge-type linear regression model. Diagnostics under perturbation of the constant variance and of individual explanatory variables are obtained when multicollinearity is present among the regressors. We also propose a statistic that reveals influential cases for Mallows's method, which is used to choose the biasing parameter of the modified ridge regression estimator. Two real data sets are used to illustrate our methodologies.
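The flavor of a Cook-type diagnostic for ridge estimation can be sketched by brute force: compare the full-data ridge fit with each leave-one-out fit and scale the difference. This is an assumed case-deletion analogue, not the paper's generalized influence function, which derives such measures analytically.

```python
# Minimal sketch: brute-force case-deletion (Cook-type) diagnostic for ridge.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 4, 1.0                            # k: ridge biasing parameter
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # induce multicollinearity
y = X @ np.array([1.0, 1.0, 0.5, 0.0]) + rng.normal(size=n)

def ridge(Xm, ym):
    return np.linalg.solve(Xm.T @ Xm + k * np.eye(p), Xm.T @ ym)

beta_full = ridge(X, y)
s2 = np.sum((y - X @ beta_full) ** 2) / (n - p)
M = X.T @ X + k * np.eye(p)                     # scaling matrix for the distance

cook = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    d = beta_full - ridge(X[mask], y[mask])     # shift from deleting case i
    cook[i] = d @ M @ d / (p * s2)
print("most influential cases:", np.argsort(cook)[-3:])
```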
Abstract: In this paper, the geometric process model is used for analyzing constant-stress accelerated life testing. The generalized half-logistic lifetime distribution is considered under progressive type-II censoring. Statistical inference is developed on the basis of the maximum likelihood approach for estimating the unknown parameters and for obtaining both asymptotic and bootstrap confidence intervals. In addition, predictive values of the reliability function under usual operating conditions are found. Moreover, a method for finding the optimal value of the ratio of the geometric process is presented. Finally, a simulation study is presented to illustrate the proposed procedures and to evaluate the performance of the geometric process model.
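The likelihood structure under progressive type-II censoring can be sketched concisely: each observed failure x_i with R_i units withdrawn contributes log f(x_i) + R_i log(1 − F(x_i)). As assumptions, scipy's half-logistic distribution stands in for the generalized half-logistic model, and the data below are a complete sample used only to exercise the likelihood, not a true progressively censored sample.

```python
# Minimal sketch: MLE under a progressive type-II censoring likelihood,
# with scipy's half-logistic as a stand-in lifetime distribution.
import numpy as np
from scipy import stats, optimize

x = np.sort(stats.halflogistic.rvs(scale=2.0, size=15, random_state=1))
R = np.array([2, 0, 1, 0, 0, 2, 0, 0, 1, 0, 0, 0, 1, 0, 2])  # withdrawals

def neg_loglik(log_scale):
    scale = np.exp(log_scale)                   # keep the scale positive
    logf = stats.halflogistic.logpdf(x, scale=scale)
    logS = stats.halflogistic.logsf(x, scale=scale)
    return -(logf + R * logS).sum()             # failures + censored survivors

res = optimize.minimize_scalar(neg_loglik)
print("MLE of scale:", np.exp(res.x))
```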
Abstract: The creation of data sets using observational methods for the lag-sequential study of behavior requires selection of a recording time unit. This is an important issue, because standard methods such as momentary sampling and partial-interval sampling consistently underestimate the frequency of some behaviors. This leads to inaccurate estimation of both the unconditional and conditional probabilities of the different behaviors, the basic descriptive and analytic tools of sequential analysis methodology. The purpose of this paper is to investigate the creation of data sets usable for sequential analysis. We show that such data vary with the time resolution and that inaccurate choices lead to biased estimates of transition probabilities.
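The resolution effect is easy to demonstrate: code a two-state behavior stream at a fine time unit, re-sample it at a coarser unit, and compare the estimated transition matrices. The two-state chain and the sampling rates below are illustrative assumptions.

```python
# Minimal sketch: transition probabilities estimated at two recording units.
import numpy as np

rng = np.random.default_rng(0)
T = 10_000
P_true = np.array([[0.95, 0.05],                # fine-grained transition matrix
                   [0.10, 0.90]])
s = np.empty(T, dtype=int)
s[0] = 0
for t in range(1, T):
    s[t] = rng.random() >= P_true[s[t - 1], 0]  # draw the next state

def transition_matrix(seq):
    m = np.zeros((2, 2))
    for a, b in zip(seq[:-1], seq[1:]):
        m[a, b] += 1
    return m / m.sum(axis=1, keepdims=True)     # row-normalized counts

print("unit = 1:\n", transition_matrix(s).round(3))
print("unit = 5:\n", transition_matrix(s[::5]).round(3))  # coarser unit
```

Coarsening the recording unit inflates the apparent off-diagonal (switching) probabilities, which is the bias the paper warns about.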