Bootstrap inference when using multiple imputation

Publication
Statistics in Medicine, 37(14): 2252-2266

For many analyses, it is common to use both bootstrapping and multiple imputation (MI): MI to address missing data and bootstrapping to obtain standard errors. For example, when using the g-formula in causal inference, bootstrapping is required to obtain standard errors; however, the data may be multiply imputed due to missing (baseline) data in the population of interest. How should bootstrapping and multiple imputation then be combined? Should one first bootstrap ($B$ times, including the missing data) and then impute ($M$ times), or first impute and then bootstrap the multiply imputed data? For the latter approach, one could use bootstrapping to estimate the standard error in each imputed data set and apply the standard MI combining rules (“Rubin’s rules”); alternatively, the $B \times M$ estimates could be pooled and 95% confidence intervals calculated from the 2.5th and 97.5th percentiles of the resulting empirical distribution. For the former approach, either the MI combining rules are applied to the imputed data of each bootstrap sample to obtain $B$ point estimates, which in turn may be used to construct confidence intervals, or the $B \times M$ estimates of the pooled data are used for interval estimation.
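To make the two orderings concrete, here is a minimal Python sketch on a toy problem (estimating a mean). It is illustrative only: the normal-draw imputation step and the simple estimand are my simplifications, standing in for a proper imputation procedure (e.g. chained equations) and a substantive analysis model. It contrasts "bootstrap, then impute" (yielding $B$ point estimates) with "impute, then bootstrap and pool all $B \times M$ estimates".

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: estimate the mean of x, with roughly 20% of values missing (MCAR).
n = 200
x = rng.normal(loc=1.0, scale=2.0, size=n)
x[rng.random(n) < 0.2] = np.nan

def impute_once(v, rng):
    """Deliberately simple stochastic imputation: draw each missing value
    from a normal fitted to the observed values. A real analysis would use
    a proper imputation model (e.g. multiple imputation by chained equations)."""
    out = v.copy()
    obs = out[~np.isnan(out)]
    miss = np.isnan(out)
    out[miss] = rng.normal(obs.mean(), obs.std(ddof=1), size=miss.sum())
    return out

B, M = 200, 10

# Ordering 1: bootstrap first (keeping the missing values), then impute each
# bootstrap sample M times; averaging the M estimates gives one point
# estimate per bootstrap sample, and the B estimates give a percentile CI.
boot_est = []
for _ in range(B):
    xb = x[rng.integers(0, n, size=n)]   # resample rows, NaNs included
    boot_est.append(np.mean([impute_once(xb, rng).mean() for _ in range(M)]))
ci_boot_mi = np.percentile(boot_est, [2.5, 97.5])

# Ordering 2: impute first (M completed data sets), then bootstrap within
# each imputed data set and pool all B*M estimates into one percentile CI.
pooled = []
for _ in range(M):
    xm = impute_once(x, rng)
    for _ in range(B):
        pooled.append(xm[rng.integers(0, n, size=n)].mean())
ci_mi_boot_pooled = np.percentile(pooled, [2.5, 97.5])

print("Bootstrap, then MI:        95% CI", ci_boot_mi)
print("MI, then bootstrap (pooled) 95% CI", ci_mi_boot_pooled)
```

The second ordering with pooled estimates ignores that the $B \times M$ estimates within one imputed data set share the same imputation, which is why its interval behaves differently from the first.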

In our paper, we evaluated the above four approaches and found that three are generally valid, whereas one (first imputing, then bootstrapping, and then pooling the estimates) is invalid. We describe the advantages, disadvantages, and implications of all four approaches and give guidance on which approach may be suitable in which context. Under the code tag above, you can find an example of how to implement the four approaches, as well as the code for the simulation studies from the paper.

What is nice about this manuscript is not only that it provides practical guidance, but also that it sparked some other interesting research. For instance, Bartlett and Hughes consider the combination of imputation and bootstrapping when the imputation and analysis models are uncongenial or misspecified (see here). Interestingly, they find that bootstrapping, followed by imputation, without pooling the estimates, is the preferred approach in this case. This is in line with our suggested approach for model selection and averaging with multiply imputed data (see here). Also, von Hippel and Bartlett proposed an alternative, computationally efficient point estimator and confidence interval for bootstrapping followed by MI, which is implemented in an R package (bootImpute).

It is nice to see this work being used and further developed; all the more so since the manuscript was initially rejected by an applied statistics journal because a reviewer insisted: “There is nothing really new in the paper”.

Related