Abstract: In this study, we compare various block bootstrap methods in terms of parameter estimation, bias, and mean squared error (MSE) of the bootstrap estimators. The comparison is based on four real-world examples and an extensive simulation study with various sample sizes, parameters and block lengths. Our results reveal that the ordered and sufficient ordered non-overlapping block bootstrap methods proposed by Beyaztas et al. (2016) provide better results in terms of parameter estimation and its MSE than conventional methods. In addition, the sufficient non-overlapping block bootstrap method and its ordered version have the smallest MSE for the sample mean among the methods compared.
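To make the quantities compared above concrete, the following is a minimal sketch of a non-overlapping block bootstrap for the sample mean, reporting the bootstrap estimate, its bias, and its MSE. The function names, the `distinct` and `ordered` flags, and the use of the observed sample mean as the reference value are all assumptions made for illustration. In particular, reading "sufficient" as "keep only distinct blocks" and "ordered" as "sort the drawn blocks by their original position" is one plausible interpretation, not a statement of the procedure in Beyaztas et al. (2016).

```python
import random
from statistics import mean


def nbb_resample(series, block_len, rng, distinct=False, ordered=False):
    """Draw one non-overlapping block bootstrap resample (illustrative)."""
    n_blocks = len(series) // block_len  # a trailing partial block is dropped
    blocks = [series[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]
    # Sample block indices with replacement.
    picks = [rng.randrange(n_blocks) for _ in range(n_blocks)]
    if distinct:
        # Assumed "sufficient" variant: keep each drawn block only once.
        picks = list(dict.fromkeys(picks))
    if ordered:
        # Assumed "ordered" variant: restore the original time order of blocks.
        picks = sorted(picks)
    return [x for b in picks for x in blocks[b]]


def nbb_mean_summary(series, block_len, n_boot=1000, seed=0, **variant):
    """Return (bootstrap estimate, bias, MSE) for the sample mean.

    Bias and MSE are measured against the observed sample mean, which stands
    in for the true parameter that a simulation study would know.
    """
    rng = random.Random(seed)
    target = mean(series)
    reps = [mean(nbb_resample(series, block_len, rng, **variant))
            for _ in range(n_boot)]
    estimate = mean(reps)
    bias = estimate - target
    mse = mean((r - target) ** 2 for r in reps)
    return estimate, bias, mse
```

For example, `nbb_mean_summary(series, block_len=5, distinct=True, ordered=True)` would summarise the variant with both flags switched on, so that different block lengths and variants can be compared on the same series.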
It is well known that under certain regularity conditions the bootstrap sampling distributions of common statistics are consistent with their true sampling distributions. However, the consistency results rely heavily on the underlying regularity conditions, and a failure to satisfy some of these may lead to a serious departure from consistency. Consequently, sampling distributions based on the ‘sufficient bootstrap’ method (which uses only the distinct units in a bootstrap sample in order to reduce the computational burden for larger sample sizes) will also be inconsistent. In this paper, we combine the ideas of the sufficient and m-out-of-n (m/n) bootstrap methods to regain consistency. We further propose an iterated version of this bootstrap method for non-regular cases, and our simulation study reveals that the proposed iterated sufficient m/n bootstrap can achieve coverage accuracies similar to, or even better than, those of percentile bootstrap confidence intervals, with less computational time in each case.
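As a rough illustration of how the two ideas can be combined, the sketch below draws m indices with replacement from n observations, keeps only the distinct units, and evaluates a statistic on them. The function name, the way the two steps are combined, and the choice of m are assumptions for illustration only; the abstract does not specify these details, and the iterated version is not shown.

```python
import random


def sufficient_m_out_of_n(data, m, stat, n_boot=2000, seed=0):
    """Return bootstrap replicates of `stat` (illustrative sketch).

    Each replicate draws m indices with replacement (m < n) and then keeps
    only the distinct units, so the statistic is computed on at most m points.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # A set comprehension discards repeated draws of the same unit.
        distinct_idx = {rng.randrange(n) for _ in range(m)}
        replicates.append(stat([data[i] for i in distinct_idx]))
    return replicates
```

A typical non-regular use would be `sufficient_m_out_of_n(data, m, stat=max)` for the sample maximum, with m chosen much smaller than `len(data)`; how to choose m is left open here.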
Abstract: We propose two classes of nonparametric point estimators of θ = P(X < Y) in the case where (X, Y) are paired, possibly dependent, absolutely continuous random variables. The proposed estimators are based on nonparametric estimators of the joint density of (X, Y) and the distribution function of Z = Y − X. We explore the use of several density and distribution function estimators and characterise the convergence of the resulting estimators of θ. We consider the use of bootstrap methods to obtain confidence intervals. The performance of these estimators is illustrated using simulated and real data. These examples show that not accounting for pairing and dependence may lead to erroneous conclusions about the relationship between X and Y.
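Since X < Y is the same event as Z = Y − X > 0, θ can be written as 1 − F_Z(0), and any estimator of the distribution function of Z yields an estimator of θ. The sketch below shows two simple instances: one based on the empirical distribution function and one based on a Gaussian-kernel smoothed distribution function. The function names and the default bandwidth are assumptions chosen for illustration; they are not claimed to be the estimators studied in the paper.

```python
import math
from statistics import stdev


def theta_empirical(x, y):
    """Estimate P(X < Y) as the fraction of pairs with y - x > 0."""
    z = [b - a for a, b in zip(x, y)]
    return sum(v > 0 for v in z) / len(z)


def theta_kernel(x, y, h=None):
    """Estimate P(X < Y) as 1 - F_hat(0) with a Gaussian-kernel smoothed CDF."""
    z = [b - a for a, b in zip(x, y)]
    if h is None:
        # Placeholder bandwidth (an assumption, not a tuned choice).
        h = 1.06 * stdev(z) * len(z) ** (-1 / 5)

    def std_normal_cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    # 1 - mean(Phi((0 - z_i) / h)) equals mean(Phi(z_i / h)) by symmetry.
    return sum(std_normal_cdf(v / h) for v in z) / len(z)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 5.0, 6.0]
print(theta_empirical(x, y))  # 0.75, since three of the four differences are positive
```

Both functions work on the paired differences, so the dependence between X and Y within each pair is retained rather than ignored.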
Abstract: Scientific interest often centers on characterizing the effect of one or more variables on an outcome. While data mining approaches such as random forests are flexible alternatives to conventional parametric models, they suffer from a lack of interpretability because variable effects are not quantified in a substantively meaningful way. In this paper we describe a method for quantifying variable effects using partial dependence, which produces an estimate that can be interpreted as the effect on the response of a one-unit change in the predictor, averaged over the effects of all other variables. Most importantly, the approach avoids the problems of model misspecification and the implementation challenges in high-dimensional settings that are encountered with other approaches (e.g., multiple linear regression). We propose and evaluate through simulation a method for constructing a point estimate of this effect size. We also propose and evaluate interval estimates based on a non-parametric bootstrap. The method is illustrated on data used for the prediction of the age of abalone.
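The partial dependence idea can be sketched as follows: for each value on a grid, the predictor of interest is set to that value in every row, the model's predictions are averaged, and the resulting curve is summarised by a single per-unit effect. The function names are hypothetical, `predict` stands for any fitted model's prediction function, and the least-squares slope used as the one-number summary is an assumption; the abstract does not state how the paper constructs its point estimate.

```python
def partial_dependence(predict, rows, j, grid):
    """Average prediction with feature j forced to each grid value."""
    curve = []
    for v in grid:
        total = 0.0
        for row in rows:
            modified = list(row)   # copy so the original data is untouched
            modified[j] = v
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def per_unit_effect(grid, curve):
    """Least-squares slope of the partial dependence curve (assumed summary)."""
    mean_x = sum(grid) / len(grid)
    mean_y = sum(curve) / len(curve)
    sxx = sum((x - mean_x) ** 2 for x in grid)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(grid, curve))
    return sxy / sxx


# Toy stand-in for a fitted model: linear in feature 0, nonlinear in feature 1.
def toy_predict(row):
    return 2.0 * row[0] + row[1] ** 2


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]
curve = partial_dependence(toy_predict, rows, 0, grid)
effect = per_unit_effect(grid, curve)  # close to 2.0 for this toy model
```

An interval estimate in the spirit of the abstract could then be obtained by recomputing `effect` on non-parametric bootstrap resamples of `rows`, refitting the model each time.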