ITERATED SUFFICIENT M-OUT-OF-N (M/N) BOOTSTRAP FOR NON-REGULAR SMOOTH FUNCTION MODELS

It is well known that, under certain regularity conditions, the bootstrap sampling distributions of common statistics are consistent for their true sampling distributions. However, these consistency results rely heavily on the underlying regularity conditions, and a failure to satisfy some of them may lead to a serious departure from consistency. Consequently, sampling distributions based on the 'sufficient bootstrap' method (which uses only the distinct units in a bootstrap sample in order to reduce the computational burden for larger sample sizes) will also be inconsistent. In this paper, we combine the ideas of the sufficient and m-out-of-n (m/n) bootstrap methods to regain consistency. We further propose an iterated version of this bootstrap method for non-regular cases. Our simulation study reveals that the proposed iterated sufficient m/n bootstrap achieves coverage accuracy similar to, or even better than, that of percentile bootstrap confidence intervals, with less computational time in each case.

Recently, Singh and Sedory (2011) proposed the sufficient bootstrap method, which uses only the distinct units in a bootstrap sample of size n drawn by simple random sampling with replacement. This reduces the computational burden and leads to better inferences in certain cases.
Although the sufficient bootstrap is consistent whenever the traditional bootstrap works, it becomes inconsistent whenever the traditional bootstrap is inconsistent. As a solution, the idea of the m/n bootstrap may be combined with sufficient bootstrapping to regain consistency; see, for example, Alin et al. (2017). In particular, let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables from an unknown distribution $F$. Also, let $x_n = (X_1, \ldots, X_n)$ and $x_n^* = (X_1^*, \ldots, X_n^*)$ be an i.i.d. random sample from $F$ and the bootstrap resample, respectively. To perform the sufficient m/n bootstrap, we take a random sample of size $m = o(n)$, $x_m^* = (X_1^*, \ldots, X_m^*)$, from $x_n$ but use only the distinct observations. Let $V_n^*$ and $V_m^*$ be the numbers of distinct observations in $x_n^*$ and $x_m^*$, respectively. Note that every unit in $x_n$ has probability $1 - (1 - 1/n)^n$ of appearing in a bootstrap sample of size $n$, and probability $1 - (1 - 1/n)^m$ of appearing in one of size $m$. Consequently, it can be shown that the expected sizes of the usual and m/n sufficient bootstrap resamples are approximately $E(V_n^*) \approx n(1 - e^{-1}) \approx 0.632n$ and $E(V_m^*) \approx n(1 - e^{-m/n})$, respectively.
Further, the iterated bootstrap method (see Hall (1986)) can be useful for obtaining a higher degree of correction, for example to coverage accuracy: a second level of bootstrap resamples is used to estimate the coverage error of the original bootstrap procedure and then to derive a correction for it. Hall (1986), Beran (1987), DiCiccio and Romano (1988), and Hall and Martin (1988) provide theoretical properties of this method and prove that the iterating principle reduces bootstrap errors in many statistical problems. In this study, we propose a combination of the iterated bootstrap with the sufficient and m/n bootstrap methods (termed the iterated sufficient m/n bootstrap), with the aim of reducing the coverage error of the percentile confidence interval in non-regular cases.
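As a quick numerical check of the resample sizes discussed above, the following minimal Python sketch draws m units with replacement from n, keeps only the distinct ones, and compares the average count with $n[1 - (1 - 1/n)^m]$, which follows from the inclusion probability by linearity of expectation. The function names (`expected_distinct`, `sufficient_resample`, `mean_distinct`) and the default seed are illustrative assumptions and do not come from the paper.

```python
import random


def expected_distinct(n, m):
    """Exact expected number of distinct units when m draws are made
    with replacement from n units: n * (1 - (1 - 1/n)**m)."""
    return n * (1.0 - (1.0 - 1.0 / n) ** m)


def sufficient_resample(data, m, rng):
    """Draw m units with replacement from data and keep each selected
    unit only once (an illustrative 'sufficient' m/n resample)."""
    n = len(data)
    chosen = {rng.randrange(n) for _ in range(m)}  # distinct indices only
    return [data[i] for i in sorted(chosen)]


def mean_distinct(n, m, reps, seed=0):
    """Monte Carlo average of the number of distinct units over reps resamples."""
    rng = random.Random(seed)
    data = list(range(n))
    total = sum(len(sufficient_resample(data, m, rng)) for _ in range(reps))
    return total / reps
```

For example, `mean_distinct(1000, 1000, 500)` should land close to `expected_distinct(1000, 1000)`, which is about 632, in line with the 0.632n approximation for m = n.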
The rest of the paper is organized as follows. Section 2 describes the proposed method in detail. Section 3 gives the asymptotic expansions of the coverage probabilities for the non-iterated and iterated sufficient m/n bootstrap methods. To evaluate the finite-sample performance of the proposed method, we consider the non-regular case "function of means with null first-order differential" given by Shao (1994); these results are also presented in Section 3. Finally, Section 4 concludes with some final remarks.

Iterated sufficient m/n bootstrap method
Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables from an unknown distribution $F \equiv F_\theta$, where the parameter $\theta$ is of primary interest.
Let $x_n = (X_1, \ldots, X_n)$ be an i.i.d. random sample from $F$, and let $R_n(x_n, \theta)$ be an approximately pivotal quantity whose distribution is given by $G_n = G_n(\cdot, F)$. In many cases of practical interest (e.g. a location-scale setting), the quantity $R_n(x_n, \theta)$ depends not only on the data and $\theta$ but also on a nuisance parameter (such as a scale parameter $\sigma$).

Beyaztas Ufuk, Alin Aylin, Bandyopadhyay Soutir

Suppose $\Theta$ is the set of all possible values of $\theta$. Then, a level $\alpha$ confidence set for the parameter $\theta$ can be obtained as

$$S_n = \{\theta \in \Theta : R_n(x_n, \theta) \le G_n^{-1}(\alpha)\} \qquad (1)$$

for any given $\alpha \in (0, 1)$, where $G_n^{-1}(\alpha)$ denotes the largest $\alpha$-th quantile of $G_n$. For any sequence $\{F_n\}$ that converges to $F$, $G_n(\cdot, F_n)$ is assumed to converge weakly to a continuous distribution function $G = G(\cdot, F)$. Then, $G_n(R_n(x_n, \theta))$ is distributed as uniform $U(0, 1)$. In classical theory, $G_n$ is approximated by its limit. However, in most cases this limit is not easy to obtain when the estimate of the parameter is a complicated statistic. The bootstrap method makes the approximation possible, since it does not require full knowledge of the underlying distribution.

Let $x_n^* = (X_1^*, \ldots, X_n^*)$ be the bootstrap sample from $\hat{F}_n$, where $\hat{F}_n$ is the empirical distribution function, which puts mass $1/n$ on each data point. Let also $\hat{\theta}$ be the estimate of $\theta$ based on $x_n$. Then, the bootstrap analogue of $R_n(x_n, \theta)$ and its bootstrap distribution conditional on $x_n$ are given by $R_n^* = R_n(x_n^*, \hat{\theta})$ and $G_n^* = G_n^*(\cdot, \hat{F}_n)$, respectively. Similar to Equation (1), the bootstrap estimate of $S_n$ is obtained as

$$S_n^* = \{\theta \in \Theta : R_n(x_n, \theta) \le G_n^{*-1}(\alpha)\}.$$

Since $\hat{F}_n$ is a consistent estimate of $F$, the bootstrap estimate $G_n^*$ converges in probability to $G$ as $n$ increases. Moreover, $G_n^*(R_n(x_n^*, \hat{\theta}))$ converges to a uniform $U(0, 1)$ distribution.
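To make the bootstrap confidence set $\{\theta : R_n(x_n, \theta) \le G_n^{*-1}(\alpha)\}$ concrete, here is a minimal Python sketch for one simple case. It assumes, purely for illustration, that $\theta$ is the mean and that the root is $R_n(x_n, \theta) = \sqrt{n}(\bar{X}_n - \theta)$; the paper does not prescribe this choice. The quantile of $G_n^*$ is approximated by Monte Carlo with B resamples, and all function names are hypothetical.

```python
import math
import random


def root(sample, theta):
    """Illustrative root R_n(x_n, theta) = sqrt(n) * (sample mean - theta)."""
    n = len(sample)
    return math.sqrt(n) * (sum(sample) / n - theta)


def bootstrap_root_quantile(sample, alpha, B, rng):
    """Monte Carlo approximation of the alpha-quantile of G_n^*,
    the conditional distribution of R_n(x_n^*, theta_hat)."""
    n = len(sample)
    theta_hat = sum(sample) / n
    roots = []
    for _ in range(B):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        roots.append(root(resample, theta_hat))
    roots.sort()
    k = min(B - 1, max(0, math.ceil(alpha * B) - 1))  # order-statistic index
    return roots[k]


def lower_confidence_bound(sample, alpha, B=1000, seed=0):
    """For this root, {theta : R_n(x_n, theta) <= q} is the half-line
    theta >= mean - q / sqrt(n); return its lower end point."""
    rng = random.Random(seed)
    q = bootstrap_root_quantile(sample, alpha, B, rng)
    n = len(sample)
    return sum(sample) / n - q / math.sqrt(n)
```

With this root the confidence set is one-sided; a different root (for example a studentized one) would give a different shape, but the quantile step stays the same.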
The iterating principle, based on the prepivoting idea of Beran (1987), transforms $R_n(x_n, \theta)$ into $R_{n,1}(x_n, \theta) = G_n^*\{R_n(x_n, \theta)\}$, whose distribution depends less on $F$ than $G_n$ does.
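The prepivoting step can be sketched with a double bootstrap. The Python example below is a minimal illustration under stated assumptions: it uses the ordinary n-out-of-n bootstrap rather than the sufficient m/n scheme, takes $\theta$ to be the mean, and uses the root $\sqrt{n}(\bar{X}_n - \theta)$. For each first-level resample it estimates the prepivoted value $G_n^{**}\{R_n^*\}$ from second-level resamples, and the $\alpha$-quantile of these values gives a calibrated level to use in place of $\alpha$. All names and the default values of B1 and B2 are hypothetical.

```python
import math
import random


def root(sample, theta):
    """Illustrative root sqrt(n) * (sample mean - theta)."""
    n = len(sample)
    return math.sqrt(n) * (sum(sample) / n - theta)


def resample(sample, rng):
    """Ordinary n-out-of-n resample drawn with replacement."""
    n = len(sample)
    return [sample[rng.randrange(n)] for _ in range(n)]


def prepivoted_values(sample, B1, B2, rng):
    """For each first-level resample, estimate the prepivoted root:
    the fraction of second-level roots not exceeding the first-level root."""
    theta_hat = sum(sample) / len(sample)
    values = []
    for _ in range(B1):
        first = resample(sample, rng)
        r_star = root(first, theta_hat)
        theta_star = sum(first) / len(first)
        hits = 0
        for _ in range(B2):
            second = resample(first, rng)
            if root(second, theta_star) <= r_star:
                hits += 1
        values.append(hits / B2)
    return values


def calibrated_level(sample, alpha, B1=200, B2=200, seed=0):
    """alpha-quantile of the prepivoted values; using this level in the
    first-level quantile is the iterated (calibrated) procedure."""
    rng = random.Random(seed)
    values = sorted(prepivoted_values(sample, B1, B2, rng))
    k = min(B1 - 1, max(0, math.ceil(alpha * B1) - 1))
    return values[k]
```

The cost is B1 × B2 resamples, which is why shrinking each resample to its distinct units matters for computing time.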

Results
In this section, we consider the non-regular case described in Shao (1994), where the traditional and sufficient bootstrap methods fail to provide consistent results. First, in Section 3.1, we describe this inconsistency problem in detail, together with the behavior of the sufficient m/n bootstrap, to establish the usefulness of this method in avoiding inconsistency. A finite-sample comparative study is presented in Section 3.2.

The Problem
Let us consider the smooth function model described by Bhattacharya and Ghosh (1978). Suppose the data $X_1, \ldots, X_n$ are i.i.d. from $F$ with mean $\mu$. Let $\theta = g(\mu)$ with estimate $\hat{\theta}_n = g(\bar{X}_n)$, and further suppose that the asymptotic variance of $n^{1/2}\hat{\theta}_n$ admits the representation $h(\mu)$, with estimate $h(\bar{X}_n)$, for some smooth functions $g$ and $h$ on $\mathbb{R}^d$. Suppose that $g$ and $h$ are continuously differentiable up to a sufficiently high order in an open neighborhood of $\mu$, and that $F$ satisfies Cramér's condition and has sufficiently many finite moments. Let us assume that $\nabla g(\mu) = 0$ and $\nabla^2 g(\mu) \neq 0$, where $\nabla g$ is the gradient and $\nabla^2 g$ is the Hessian matrix of $g$. In this case, $\hat{\theta}_n$ is $n$-consistent rather than $\sqrt{n}$-consistent, and its limiting distribution is of chi-squared type.
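The change of rate can be seen from a second-order Taylor expansion of $g$ around $\mu$. The sketch below assumes that $X_1$ has a finite covariance matrix, written here as $\Sigma$ (a symbol introduced only for this illustration), so that $\sqrt{n}(\bar{X}_n - \mu)$ converges in distribution to $Z \sim N_d(0, \Sigma)$.

```latex
\begin{aligned}
\hat{\theta}_n - \theta
  &= g(\bar{X}_n) - g(\mu) \\
  &= \nabla g(\mu)^{\top}(\bar{X}_n - \mu)
     + \tfrac{1}{2}(\bar{X}_n - \mu)^{\top}\nabla^2 g(\mu)(\bar{X}_n - \mu)
     + o_p(n^{-1}) \\
  &= \tfrac{1}{2}(\bar{X}_n - \mu)^{\top}\nabla^2 g(\mu)(\bar{X}_n - \mu)
     + o_p(n^{-1})
     \qquad \text{since } \nabla g(\mu) = 0, \\[4pt]
n(\hat{\theta}_n - \theta)
  &\xrightarrow{d} \tfrac{1}{2}\, Z^{\top}\nabla^2 g(\mu)\, Z,
  \qquad Z \sim N_d(0, \Sigma).
\end{aligned}
```

In the scalar case $d = 1$ with variance $\sigma^2$, the limit reduces to $\tfrac{1}{2} g''(\mu)\sigma^2 \chi^2_1$, which is the chi-squared type limit referred to above.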

A numerical study
We consider the exponential example given by Cheung et al. (2005)

Conclusion
In this study, we propose introducing the iterating principle into the sufficient m/n bootstrap method to improve the coverage accuracy of bootstrap percentile confidence intervals in non-regular smooth function models. The results of the simulation study show that the iterated sufficient m/n bootstrap achieves coverage accuracy similar to, or even better than, that of the iterated m/n bootstrap. Iteration increases the computing time, since a double bootstrap is used. On the other hand, using only the distinct units in the resamples reduces the computational burden, which is an important result of this study. Our records (for a single simulation only) show computing times of roughly 3.36 min for the iterated m/n bootstrap and 2.56 min for its sufficient bootstrap version when B1 = B2 = 1000 and n = 20000. That is, using the sufficient bootstrap reduces the computational burden of the iterated m/n bootstrap by roughly 24%.
As a future study, the performance of the proposed method can also be studied for testing mixture models or for examining the symmetry of the underlying distribution as an alternative to the