Nonparametric bias-corrected and accelerated bootstrap confidence limits

This function is a version of bcajack that allows all the recomputations of the original statistic function \(f\) to be carried out separately. This is an advantage if \(f\) is time-consuming, in which case the B replications for the nonparametric bca calculations might need to be done on a distributed basis.

To use bcajack2 in this mode, we first compute a list Blist via Blist <- list(Y = Y, tt = tt, t0 = t0). Here tt is a vector of length B having i-th entry tt[i] <- func(x[Ii,], ...), where x is the \(n \times p\) data matrix and Ii is a bootstrap vector of (observation) indices. Y is a B by \(n\) count matrix, whose i-th row is the counts corresponding to Ii. For example if n = 5 and Ii = (2, 5, 2, 1, 4), then Yi = (1, 2, 0, 1, 1). Having computed Blist, bcajack2 is invoked as bcajack2(Blist) without need to enter the function \(func\).

bcajack2(x, B, func, ..., m = nrow(x), mr, pct = 0.333, K = 2,
  J = 12, alpha = c(0.025, 0.05, 0.1, 0.16), verbose = TRUE)

Arguments

x	an \(n \times p\) data matrix, rows are observed \(p\)-vectors, assumed to be independently sampled from target population. If \(p\) is 1 then `x` can be a vector.
B	number of bootstrap replications. `B` can also be a vector of `B` bootstrap replications of the estimated parameter of interest, computed separately. If `B` is `Blist` as explained above, `x` is not needed.
func	function \(\hat{\theta}=func(x)\) computing estimate of the parameter of interest; \(func(x)\) should return a real value for any \(n^\prime \times p\) matrix \(x^\prime\), \(n^\prime\) not necessarily equal to \(n\)
...	additional arguments for `func`.
m	an integer less than or equal to \(n\); the routine collects the \(n\) rows of `x` into `m` groups to speed up the jackknife calculations for estimating the acceleration value \(a\); typically `m` is 20 or 40 and does not have to exactly divide \(n\). However, warnings will be shown.
mr	if \(m < n\) then `mr` repetions of the randomly grouped jackknife calculations are averaged.
pct	`bcajack2` uses those count vectors nearest (1,1,...1) to estimate the gradient of the statistic, "nearest" being defined as those count vectors in the smallest `pct` of all B of them. Default value for `pct is 1/3 (see appendix in Efron and Narasimhan for further details)
K	a non-negative integer. If `K` > 0, bcajack also returns estimates of internal standard error, that is, of the variability due to stopping at `B` bootstrap replications rather than going on to infinity. These are obtained from a second type of jackknifing, taking an average of `K` separate jackknife estimates, each randomly splitting the `B` bootstrap replications into `J` groups.
J	the number of groups into which the bootstrap replications are split
alpha	percentiles desired for the bca confidence limits. One only needs to provide `alpha` values below 0.5; the upper limits are automatically computed
verbose	logical for verbose progress messages

Value

a named list of several items

lims : first column shows the estimated bca confidence limits at the requested alpha percentiles. These can be compared with the standard limits \(\hat{\theta} + \hat{\sigma}z_{\alpha}\), third column. The second column jacksd gives the internal standard errors for the bca limits, quite small in the example. Column 4, pct, gives the percentiles of the ordered B bootstrap replications corresponding to the bca limits, eg the 897th largest replication equalling the .975 bca limit .557.
stats : top line of stats shows 5 estimates: theta is \(func(x)\), original point estimate of the parameter of interest; sdboot is its bootstrap estimate of standard error; z0 is the bca bias correction value, in this case quite negative; a is the acceleration, a component of the bca limits (nearly zero here); sdjack is the jackknife estimate of standard error for theta. Bottom line gives the internal standard errors for the five quantities above. This is substantial for z0 above.
B.mean : bootstrap sample size B, and the mean of the B bootstrap replications \(\hat{\theta^*}\)
ustats : The bias-corrected estimator 2 * t0 - mean(tt), and an estimate sdu of its sampling error
seed : The random number state for reproducibility

Examples

data(diabetes, package = "bcaboot")
Xy <- cbind(diabetes$x, diabetes$y)
rfun <- function(Xy) {
  y <- Xy[, 11]
  X <- Xy[, 1:10]
  summary(lm(y~X) )$adj.r.squared
}
set.seed(1234)
bcajack2(x = Xy, B = 1000, func = rfun, m = 40, verbose = FALSE)
#> $call
#> bcajack2(x = Xy, B = 1000, func = rfun, m = 40, verbose = FALSE)
#> 
#> $lims
#>             bca      jacksd       std   pct
#> 0.025 0.4077984 0.007394224 0.4377771 0.002
#> 0.05  0.4332593 0.010170910 0.4488356 0.007
#> 0.1   0.4463550 0.003897557 0.4615854 0.021
#> 0.16  0.4551329 0.003615474 0.4716607 0.041
#> 0.5   0.4932018 0.002052666 0.5065603 0.237
#> 0.84  0.5270034 0.001613442 0.5414600 0.609
#> 0.9   0.5364805 0.001790711 0.5515353 0.711
#> 0.95  0.5481595 0.002170036 0.5642850 0.818
#> 0.975 0.5587796 0.003204113 0.5753435 0.887
#> 
#> $stats
#>         theta       sdboot          z0             a       sdjack
#> est 0.5065603 0.0350941304 -0.35578711 -0.0139428485 0.0359595105
#> jsd 0.0000000 0.0007234153  0.03110472  0.0008423517 0.0002927981
#> 
#> $B.mean
#> [1] 1000.0000000    0.5174461
#> 
#> $ustats
#>      ustat        sdu 
#> 0.49567452 0.03667226 
#> 
#> attr(,"class")
#> [1] "bcaboot"

Nonparametric bias-corrected and accelerated bootstrap confidence limits

Arguments

Value

Examples

Contents