R/bcajack.R
bcajack.Rd
This routine computes nonparametric confidence intervals for bootstrap estimates. For reproducibility, save or set the random number state before calling this routine.
bcajack(x, B, func, ..., m = nrow(x), mr = 5, K = 2, J = 10, alpha = c(0.025, 0.05, 0.1, 0.16), verbose = TRUE)
x | an \(n \times p\) data matrix, rows are observed
\(p\)-vectors, assumed to be independently sampled from
target population. If \(p\) is 1 then |
---|---|
B | number of bootstrap replications. It can also be a vector
of |
func | function \(\hat{\theta}=func(x)\) computing estimate of the parameter of interest; \(func(x)\) should return a real value for any \(n^\prime \times p\) matrix \(x^\prime\), \(n^\prime\) not necessarily equal to \(n\) |
... | additional arguments for |
m | an integer less than or equal to \(n\); the routine
collects the \(n\) rows of |
mr | if \(m < n\) then |
K | a non-negative integer. If |
J | the number of groups into which the bootstrap replications are split |
alpha | percentiles desired for the bca confidence limits. One
only needs to provide |
verbose | logical for verbose progress messages |
a named list of several items
lims : first column shows the estimated bca confidence limits
at the requested alpha percentiles. These can be compared with
the standard limits \(\hat{\theta} +
\hat{\sigma}z_{\alpha}\), third column. The second column
jacksd
gives the internal standard errors for the bca limits,
quite small in the example. Column 4, pct
, gives the
percentiles of the ordered B bootstrap replications
corresponding to the bca limits, eg the 897th largest
replication equalling the .975 bca limit .557.
stats : top line of stats shows 5 estimates: theta is
\(f(x)\), original point estimate of the parameter of
interest; sdboot
is its bootstrap estimate of standard error;
z0
is the bca bias correction value, in this case quite
negative; a
is the acceleration, a component of the bca
limits (nearly zero here); sdjack
is the jackknife estimate
of standard error for theta. Bottom line gives the internal
standard errors for the five quantities above. This is
substantial for z0
above.
B.mean : bootstrap sample size B, and the mean of the B bootstrap replications \(\hat{\theta^*}\)
ustats : The bias-corrected estimator 2 * t0 - mean(tt)
,
and an estimate sdu
of its sampling error
seed : The random number state for reproducibility
Bootstrap confidence intervals depend on three elements:
the cdf of the \(B\) bootstrap replications \(t_i^*\), \(i=1\ldots B\)
the bias-correction number \(z_0=\Phi(\sum_i^B I(t_i^* < t_0) / B )\) where \(t_0=f(x)\) is the original estimate
the acceleration number \(a\) that measures the rate of change in \(\sigma_{t_0}\) as \(x\), the data changes.
The first two of these depend only on the bootstrap distribution,
and not how it is generated: parametrically or
non-parametrically. Program bcajack can be used in a hybrid fashion
in which the vector tt
of B bootstrap replications is first
generated from a parametric model.
So, in the diabetes example below, we might first draw bootstrap
samples \(y^* \sim N(X\hat{\beta}, \hat{\sigma}^2 I)\) where
\(\hat{\beta}\) and \(\hat{\sigma}\) were obtained from
lm(y~X)
; each \(y^*\) would then provide a bootstrap
replication tstar = rfun(cbind(X, ystar))
. Then we could get bca
intervals from bcajack(Xy, tt, rfun ....)
with tt
,
the vector of B tstar
values. The only difference from a full
parametric bca analysis would lie in the nonparametric estimation
of \(a\), often a negligible error.
DiCiccio T and Efron B (1996). Bootstrap confidence intervals. Statistical Science 11, 189-228
Efron B (1987). Better bootstrap confidence intervals. JASA 82 171-200
B. Efron and B. Narasimhan. Automatic Construction of Bootstrap Confidence Intervals, 2018.
data(diabetes, package = "bcaboot") Xy <- cbind(diabetes$x, diabetes$y) rfun <- function(Xy) { y <- Xy[, 11] X <- Xy[, 1:10] summary(lm(y~X) )$adj.r.squared } set.seed(1234) ## n = 442 = 34 * 13 bcajack(x = Xy, B = 1000, func = rfun, m = 34, verbose = FALSE)#> $call #> bcajack(x = Xy, B = 1000, func = rfun, m = 34, verbose = FALSE) #> #> $lims #> bca jacksd std pct #> 0.025 0.4397641 0.002865222 0.4436815 0.007 #> 0.05 0.4473747 0.003481831 0.4537907 0.016 #> 0.1 0.4577052 0.002144860 0.4654461 0.038 #> 0.16 0.4665482 0.002518510 0.4746565 0.068 #> 0.5 0.4983510 0.001417285 0.5065603 0.311 #> 0.84 0.5305117 0.003404510 0.5384641 0.692 #> 0.9 0.5412166 0.001847288 0.5476746 0.785 #> 0.95 0.5501776 0.002575333 0.5593299 0.875 #> 0.975 0.5624409 0.002941915 0.5694391 0.929 #> #> $stats #> theta sdboot z0 a sdjack #> est 0.5065603 0.0320816221 -0.24558952 0.0004269264 0.0210771 #> jsd 0.0000000 0.0005108335 0.04423824 0.0000000000 0.0000000 #> #> $B.mean #> [1] 1000.0000000 0.5144936 #> #> $ustats #> ustat sdu #> 0.49862706 0.03984912 #> #> attr(,"class") #> [1] "bcaboot"