MAMI

MAMI - An R package for Model Averaging (and Model Selection) after Multiple Imputation

$\rightarrow$ Background

$\rightarrow$ Software

$\rightarrow$ Further considerations and literature

Background

Model Averaging

The motivation for variable selection in regression models is based on the rationale that associational relationships between variables are best understood by reducing the model’s dimension. The problem with this approach is that (i) regression parameters after model selection are often biased and (ii) the respective standard errors are too small because they do not reflect the uncertainty related to the model selection process. It has been proposed that the drawback of model selection can be overcome by model averaging. With model averaging, one calculates a weighted average $\hat{\bar{\beta}} = \sum_{\kappa=1}^{k} w_{\kappa} \hat{\beta}_{\kappa}$ from $k$ parameter estimates of a set of candidate (regression) models $\mathcal{M}=\{M_1,\ldots,M_k\}$, where the weights are calculated in a way such that better models receive a higher weight.

A popular weight choice would be based on the exponential AIC,

\begin{eqnarray} w_{\kappa}^{\text{AIC}} &=& \frac{\exp(-\frac{1}{2} \mathrm{AIC}_{\kappa})}{\sum_{\kappa=1}^k\exp(-\frac{1}{2} \mathrm{AIC}_{\kappa})} \, , \end{eqnarray}

where $\mathrm{AIC}_{\kappa}$ is the AIC value related to model $M_{\kappa}\in\mathcal{M}$ and $\sum_{\kappa} w_{\kappa}^{\text{AIC}} =1$. It has been suggested to estimate the variance of the scalar $\hat{\bar{\beta}}_j \in \hat{\bar{\beta}}$ via
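The AIC weights above can be computed in a few lines of base R. The following is an illustrative sketch with simulated data and three hand-picked candidate models, not code from the MAMI package itself:

```r
# Sketch: AIC-based model averaging weights for a small set of
# candidate linear models (illustrative data, not MAMI code).
set.seed(1)
d <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50))

candidates <- list(
  M1 = lm(y ~ x1, data = d),
  M2 = lm(y ~ x2, data = d),
  M3 = lm(y ~ x1 + x2, data = d)
)

aic <- sapply(candidates, AIC)
# Subtracting min(aic) avoids numerical underflow and leaves the
# normalized weights unchanged
w <- exp(-0.5 * (aic - min(aic)))
w <- w / sum(w)
w        # lower-AIC (better) models receive higher weight
sum(w)   # the weights sum to 1
```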

\begin{eqnarray} \widehat{\text{Var}}(\hat{\bar{\beta}}_j) &=& \left\{ \sum_{\kappa=1}^k w_{\kappa} \sqrt{\widehat{\text{Var}}(\hat{\beta}_{j,\kappa}|M_{\kappa}) + (\hat{\beta}_{j,\kappa}-\hat{\bar{\beta}}_j)^2} \right\}^2 , \end{eqnarray}

where $\hat{\beta}_{j,\kappa}$ is the $j^{th}$ regression coefficient of the $\kappa^{th}$ candidate model. This approach tackles problem (ii), the incorporation of model selection uncertainty into the standard errors of the regression parameters; but it does not necessarily tackle problem (i), as the regression parameters may still be biased. There are several suggestions on how the weights can be calculated, and those implemented in “MAMI” are explained in the manual. Note that model selection can be viewed as a special case of model averaging where the ``best'' model receives weight $1$ (and all others a weight of $0$). All implemented model selection options are listed in the manual too.
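Putting the weights and the variance formula together, the model-averaged point estimate and its variance for a single coefficient can be sketched as follows. This is an illustrative base-R calculation (here for the coefficient of a variable `x1` contained in both candidate models), not the MAMI implementation:

```r
# Sketch: model-averaged estimate and variance for one coefficient,
# using AIC weights over two candidate models (illustrative data).
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- 0.5 * d$x1 + rnorm(50)

candidates <- list(lm(y ~ x1, data = d),
                   lm(y ~ x1 + x2, data = d))

aic <- sapply(candidates, AIC)
w   <- exp(-0.5 * (aic - min(aic)))
w   <- w / sum(w)

beta <- sapply(candidates, function(m) coef(m)["x1"])
se   <- sapply(candidates,
               function(m) summary(m)$coefficients["x1", "Std. Error"])

beta_bar <- sum(w * beta)                    # model-averaged point estimate
# Variance formula from above: conditional variance plus a penalty
# for disagreement between the candidate models
var_bar  <- sum(w * sqrt(se^2 + (beta - beta_bar)^2))^2
```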

Multiple Imputation

Multiple imputation (MI) is a popular method to address missing data. Based on assumptions about the data distribution (and the mechanism which gives rise to the missing data), missing values can be imputed by means of draws from the posterior predictive distribution of the unobserved data given the observed data. This procedure is repeated to create $M$ imputed data sets, the (regression) analysis is then conducted on each of these data sets, and the $M$ results ($M$ point and $M$ variance estimates) are combined by a set of simple rules:

\begin{eqnarray} \hat{\beta}_{\text{MI}} &=& \frac{1}{M} \sum_{m=1}^M \hat{\beta}^{(m)} \end{eqnarray}

and \begin{eqnarray} \widehat{{\text{Cov}}}(\hat{\beta}_{\text{MI}}) &=& \frac{1}{M} \sum_{m=1}^{M} \widehat{\text{Cov}}(\hat{\beta}^{(m)}) + \frac{M+1}{M(M-1)} \sum_{m=1}^{M} (\hat{\beta}^{(m)}-\hat{\beta}_{\text{MI}}) (\hat{\beta}^{(m)}-\hat{\beta}_{\text{MI}})^{'} \end{eqnarray}

where $\hat{\beta}^{(m)}$ refers to the estimate of $\beta$ in the $m^{th}$ imputed set of data. Confidence intervals are based on a particular $t$-distribution.
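The combination rules above (Rubin's rules) are easy to apply by hand for a scalar parameter. The numbers below are illustrative stand-ins for the output of an imputation run, and the degrees of freedom are those of the standard Rubin $t$-approximation:

```r
# Sketch: Rubin's rules for combining M point and variance estimates
# of a scalar parameter (illustrative numbers, not real imputation output).
M      <- 5
beta_m <- c(0.42, 0.45, 0.40, 0.47, 0.44)    # estimates from M imputed data sets
var_m  <- c(0.010, 0.011, 0.009, 0.012, 0.010)

beta_MI <- mean(beta_m)                          # pooled point estimate
W       <- mean(var_m)                           # within-imputation variance
B       <- sum((beta_m - beta_MI)^2) / (M - 1)   # between-imputation variance
var_MI  <- W + (1 + 1/M) * B                     # total variance, as above

# Degrees of freedom of Rubin's t-approximation for the confidence interval
df <- (M - 1) * (1 + W / ((1 + 1/M) * B))^2
ci <- beta_MI + c(-1, 1) * qt(0.975, df) * sqrt(var_MI)
```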

Model Averaging (or Model Selection) after Multiple Imputation

How can model averaging and model selection be applied to multiply imputed data? The detailed motivation can be found in the 2014 reference below. The basic result for model averaging is \begin{eqnarray} \hat{\bar{\beta}}_{\text{MI}} &=& \frac{1}{M} \sum_{m=1}^{M} \hat{\bar{\beta}}^{(m)}\quad \text{with} \quad \hat{\bar{\beta}}^{(m)} = \sum_{\kappa=1}^{k} w_{\kappa}^{(m)} \hat{\beta}_{\kappa}^{(m)} \end{eqnarray} and applies to any weight choice. If the variance of the model averaging estimator is estimated via the formula given above, the overall variance of the estimator after multiple imputation is \begin{eqnarray} \widehat{\text{Var}}(\hat{\bar{\beta}}_{j,\text{MI}}) &=& \frac{1}{M} \sum_{m=1}^{M} \left\{ \sum_{\kappa=1}^k w_{\kappa}^{(m)} \sqrt{\widehat{\text{Var}}(\hat{\beta}_{j,\kappa}^{(m)})+(\hat{\beta}_{j,\kappa}^{(m)}-\hat{\bar{\beta}}_j^{(m)})^2} \right\}^2 \nonumber \\ && +\, \frac{M+1}{M(M-1)} \sum_{m=1}^{M} (\hat{\bar\beta}_j^{(m)}-\hat{\bar\beta}_{j,\text{MI}})^2 \, . \end{eqnarray} Confidence intervals could then again be estimated based on a $t$-distribution (as explained above) or, alternatively, via bootstrapping (see the 2014 reference below, and this publication of mine for a general Bootstrap-MI framework).
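The two-stage procedure above (average within each imputed data set, then pool across data sets) can be sketched in base R as follows. The toy data below stand in for $M$ imputed versions of one incomplete data set; this illustrates the formulas, not MAMI's internals:

```r
# Sketch: model averaging after multiple imputation for one coefficient.
# Step 1: AIC-weighted average within each imputed data set.
# Step 2: pool the M results with the combination rule above.
set.seed(2)
M <- 3
imputed <- lapply(1:M, function(m) {
  d <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
  d$y <- 0.5 * d$x1 + rnorm(60)
  d                                   # stand-in for the m-th imputed data set
})

per_imp <- sapply(imputed, function(d) {
  cand <- list(lm(y ~ x1, data = d), lm(y ~ x1 + x2, data = d))
  aic  <- sapply(cand, AIC)
  w    <- exp(-0.5 * (aic - min(aic))); w <- w / sum(w)
  beta <- sapply(cand, function(m) coef(m)["x1"])
  se   <- sapply(cand,
                 function(m) summary(m)$coefficients["x1", "Std. Error"])
  bbar <- sum(w * beta)
  c(est = bbar,
    var = sum(w * sqrt(se^2 + (beta - bbar)^2))^2)
})

beta_MI <- mean(per_imp["est", ])     # pooled model-averaged estimate
var_MI  <- mean(per_imp["var", ]) +   # within- plus between-imputation part
  (M + 1) / (M * (M - 1)) * sum((per_imp["est", ] - beta_MI)^2)
```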

Model selection after imputation works essentially the same, except that parameters associated with variables which have not been selected are set to $0$. With this convention, a variable will be formally selected if it is selected in at least one imputed set of data, but its overall impact will depend on how often it is chosen. Here, confidence intervals will almost always be too narrow if the formula above is applied (because of model selection uncertainty), and bootstrapping is strongly recommended.

As a consequence, for both model selection and model averaging, effects of variables which are not consistently supported across imputed data sets (and candidate models) will simply be less pronounced.

Software

MAMI computes the point estimates as explained above, together with confidence intervals that are based either on the formula above, a Bayesian variation thereof, or bootstrapping (the preferred option).

In addition, a variable importance measure (averaged over the $M$ imputed data sets) will be calculated: this measure simply sums up the weights $w_{\kappa}$ of those candidate models $M_{\kappa}$ that contain the relevant variable, and lies between $0$ (unimportant) and $1$ (very important). It is similar to the Bayesian posterior effect probability. Results can be interpreted and reported as suggested in the manual.
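The variable importance measure is simply a sum of model weights. A minimal sketch with illustrative (made-up) weights for three candidate models:

```r
# Sketch: variable importance as the summed weights of those candidate
# models that contain the variable (weights are illustrative numbers).
w        <- c(M1 = 0.50, M2 = 0.30, M3 = 0.20)   # model averaging weights
contains <- c(M1 = TRUE, M2 = FALSE, M3 = TRUE)  # does the model include x1?

importance <- sum(w[contains])   # 0.7, between 0 (unimportant) and 1 (important)
```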

MAMI’s main function is mami(). It is recommended to get familiar with the function’s syntax by typing ?mami, running the examples at the bottom of the help page, and reading through the manual.

  • The package can be downloaded here. Install it via
install.packages("MAMI", 
	repos=c("http://R-Forge.R-project.org","http://cran.at.r-project.org"),
	dependencies=TRUE)
  • The manual can be found here.
  • The package provides access to less frequently used model averaging techniques, offers integrated bootstrap estimation and easy-to-use parallelization.
  • Optimal model averaging procedures are implemented too, and wrappers for their use in super learning are integrated into the package. See also here.

Further considerations

Interpretation of regression parameters requires care: for any explanatory interpretation, causal considerations need to be taken into account (see my paper on regression and causality here). However, both model averaging and post-model-selection estimators are biased; thus, in most cases a causal interpretation is invalid. Even when confidence interval coverage is improved, as explained above, this does not change the fact that in many cases model averaging and model selection estimators are more suitable for predictive tasks. Modern doubly robust estimators, such as targeted maximum likelihood estimators, can integrate data-adaptive approaches which are based on predictive considerations, while still retaining valid inference. Therefore, the integration of model averaging estimators into machine learning approaches which are used for causal effect estimation can be attractive, as explained in this recent paper of mine. MAMI offers a couple of wrappers that can be used to integrate different model averaging approaches into super learning, a data-adaptive approach which uses a weighted combination of different learning algorithms and is popular in causal inference. See Section 6.3 of the manual.

Related