Bayesian “random-effects” models

 

Stata 14 introduced bayesmh for fitting Bayesian models. You can choose one of many built-in models or write your own. See Bayesian analysis and Programming your own Bayesian models for details.

In this article, we show you how to use bayesmh to fit a Bayesian “random-effects” model. We write “random effects” in quotes because all effects (parameters) are considered random within the Bayesian framework. These models are typically referred to as Bayesian multilevel or Bayesian hierarchical models.

We revisit, using the Bayesian approach, the random-effects meta-analysis model described in example 6 of [ME] me. The term “meta-analysis” refers to a statistical analysis that involves summarizing results from similar but independent studies.

We consider data from Turner et al. (2000) that contain estimates of treatment effects expressed as log odds-ratios (lnOR) and their respective variances (var) from nine clinical trials that examined the effect of taking diuretics during pregnancy on the risk of preeclampsia. Negative lnOR values indicate that taking diuretics lowers the risk of preeclampsia. The model can be written as

$$y_i \sim N(\mu_i,\,\sigma_i^2), \qquad \mu_i \sim N(\theta,\,\tau^2) \tag{1}$$

 

where μᵢ is the mean treatment effect in trial i, θ is the overall mean, σᵢ² is the variance of the observed treatment effect and is considered fixed, and τ² is the between-trial variance. The parameters of interest are θ and τ²; a τ² close to 0 would suggest homogeneity of the log odds-ratio estimates across studies.

 

In example 6 of [ME] me, we fit this random-effects model using meglm and obtain estimates of θ and τ² of −0.52 and 0.24, with respective 95% confidence intervals of [−0.92, −0.11] and [0.048, 1.19].

For our Bayesian analysis, we additionally need to specify priors for θ and τ² in model (1).

Notice that the random-effects model (1) already assumes a normal prior for each individual trial effect μᵢ. We consider fairly noninformative normal and inverse-gamma prior distributions for the parameters θ and τ².

$$\theta \sim \text{Normal}(0,\,10000), \qquad \tau^2 \sim \text{InvGamma}(0.0001,\,0.0001) \tag{2}$$

 

To fit model (1)–(2) using bayesmh, we type

. fvset base none trial
. bayesmh lnOR i.trial, noconstant likelihood(normal(var)) ///
        prior({lnOR:i.trial}, normal({theta},{tau2}))      ///
        prior({theta}, normal(0,10000))                     ///
        prior({tau2}, igamma(0.0001,0.0001))                ///
        block({lnOR:i.trial}, split) block({theta}, gibbs)  ///
        block({tau2}, gibbs) dots

 

The first four lines of the bayesmh command provide the model specification and are fairly straightforward. The last two lines improve the efficiency of the sampler, a step that is crucial when fitting high-dimensional hierarchical models such as random-effects models. We sample all parameters separately by placing them in individual blocks, specifying option split for the levels of trial, and we also request Gibbs sampling for θ and τ².

Here is the output:

  (output omitted)

Because this is a random-effects example, we feel that a note of caution is in order. bayesmh uses an adaptive random-walk Metropolis–Hastings algorithm to fit all models (with potential Gibbs updates for some parameters). This algorithm provides an exceptionally flexible framework for fitting any Bayesian model, but that flexibility comes at a price. The algorithm becomes inefficient for models with many parameters and may become prohibitively slow for some models. So, in theory, you can specify models with any number of levels of hierarchy and with any number of effects within a hierarchy. In practice, however, models with many random effects or many hierarchical levels may be infeasible. This poses no problem for our example with nine effects but would likely be problematic for datasets with thousands of random effects.
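If you are concerned about sampling efficiency in your own application, one quick check, which is not part of the original example, is to inspect the effective sample sizes and efficiencies of the parameters after estimation:

. bayesstats ess {theta} {tau2}

Effective sample sizes that are tiny relative to the MCMC sample size, or very low efficiencies, would suggest revisiting the blocking or requesting Gibbs sampling where it is available.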

Because we used noninformative priors, our Bayesian results are similar to the frequentist results from example 6. For example, the posterior mean estimate of the overall mean θ is −0.51 with a 95% credible interval of [−1.01, −0.017], the range within which θ lies with a posterior probability of 0.95.
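These summaries are part of the default bayesmh output. If you want to redisplay them later without refitting the model, a command along these lines should work:

. bayesstats summary {theta} {tau2}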

 

We use bayesgraph to evaluate MCMC convergence visually. For example, here are diagnostics for θ and τ²:

. bayesgraph diagnostics {theta}, histopts(normal)


. bayesgraph diagnostics {tau2}


 

Our visual diagnostics raise no concern for these parameters. We can also check MCMC convergence for the trial-specific μᵢ's by typing


. bayesgraph diagnostics {lnOR:i.trial}
  (output omitted)

We can test whether taking diuretics reduces the risk of preeclampsia overall by computing the probability that θ is negative.
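Assuming the bayesmh results above are still in memory, an interval-hypothesis test along these lines should produce that probability:

. bayestest interval {theta}, upper(0)

This reports the estimated posterior probability that θ is less than 0.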

 

 

That probability is 0.98.

If desired, we can also compute an estimate of the overall odds ratio.
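The overall odds ratio is simply exp(θ), so its posterior summary can be obtained from the simulated values of θ. A sketch of one way to do this, with OR being our own label for the expression, is

. bayesstats summary (OR: exp({theta}))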

 

 

A neat feature of our Bayesian analysis is that we can explore the distributions of the estimated parameters. For example, we can look at distributions of effects from individual trials.

. bayesgraph histogram {lnOR:i.trial},                      ///
        byparm(legend(off) noxrescale noyrescale            ///
        title(Posterior distributions of trial effects))    ///
        normal xline(-0.51)


 

The posterior distributions are fairly normal for most trials. Posteriors for some trials are more closely centered on the overall mean of −0.51, particularly for trials 2 and 7.

There is noticeable variability in the estimated treatment effects across trials. The estimates from trials 6 and 9 are more precise than those from the other trials.

We can also test for an effect in each trial by estimating the probability that the trial-specific effect is negative (meaning that diuretics work). For example,
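As a sketch, trials 2 and 9 could be tested as follows, assuming the factor-variable parameter names that bayesmh created above:

. bayestest interval ({lnOR:2.trial}, upper(0)) ({lnOR:9.trial}, upper(0))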

 

 

The probability of a negative treatment effect is almost 1 for trial 2 and only about 0.55 for trial 9.

 

Reference

Turner, R. M., R. Z. Omar, M. Yang, H. Goldstein, and S. G. Thompson. 2000. A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine 19: 3417–3432.

 

— Yulia Marchenko
Director of Biostatistics, StataCorp

— Nikolay Balov
Senior Statistician and Software Developer, StataCorp