Bayesian Model Averaging (BMA) for Variable Selection

Background on BMA:

Traditional model-building strategies often rely on stepwise variable selection to choose candidate covariates, but stepwise methods can perform poorly, producing biased estimates and overly narrow confidence intervals, among other problems (see Harrell, 2001). An alternative is Bayesian Model Averaging (BMA), which accounts for uncertainty in variable selection by averaging inferences over a set of plausible models, weighted by their posterior probabilities. In practice, we can use BMA to identify predictors with high posterior inclusion probabilities.
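Briefly, in the BMA framework of Hoeting et al. (1999), the posterior distribution of a quantity of interest Delta (for example, a regression coefficient), given data D, is a weighted average of its posterior under each candidate model M_1, ..., M_K, with weights equal to the posterior model probabilities:

\[
\Pr(\Delta \mid D) \;=\; \sum_{k=1}^{K} \Pr(\Delta \mid M_k, D)\,\Pr(M_k \mid D),
\qquad
\Pr(M_k \mid D) \;=\; \frac{\Pr(D \mid M_k)\,\Pr(M_k)}{\sum_{l=1}^{K} \Pr(D \mid M_l)\,\Pr(M_l)}.
\]

The posterior (inclusion) probability that a given covariate belongs in the model is the sum of Pr(M_k | D) over all models containing that covariate; these are the quantities reported as "posterior probabilities" in the sample output below. The BMA package used below approximates the model weights with the BIC.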

 

References:

Harrell, F. E. (2001), Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis, Springer-Verlag, New York.

Hoeting, J. A., Madigan, D., Raftery, A. E., and Volinsky, C. T. (1999), Bayesian model averaging: A tutorial (with comments by M. Clyde, David Draper, and E. I. George, and a rejoinder by the authors), Statistical Science, 14(4), 382-417.

Helpful websites for further reading:

BMA for linear regression: https://rdrr.io/cran/BMA/man/bicreg.html

BMA for generalized linear models, including binary outcomes: https://rdrr.io/cran/BMA/man/bic.glm.html

BMA for survival models: https://rdrr.io/cran/BMA/man/bic.surv.html
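
For reference, once the BMA package is installed and loaded (see the sample code below), the calls for the three settings above follow the same pattern. The sketch below uses hypothetical object names (x, y, y.binary, surv.time, surv.status); consult the linked documentation for the full argument lists:

# linear regression (continuous outcome)

> fit.lm <- bicreg(x, y)

# logistic regression (binary outcome)

> fit.logit <- bic.glm(x, y.binary, glm.family = "binomial")

# Cox regression (survival outcome)

> fit.cox <- bic.surv(x, surv.t = surv.time, cens = surv.status)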

 

Sample code in R:

# Install the BMA package (once) and then load it
# (see http://www.r-bloggers.com/installing-r-packages/ for general guidance on installing R packages):

> install.packages("BMA")

> library(BMA)

# Then, import your data to R:

> mydata <- as.matrix(read.csv(file="c:/forbma.csv", sep=",", header=TRUE))

# view first row of data

> mydata[1,]

# outcome y is in column 1

> y<-mydata[,1]

# covariates start in column 2

> x<-mydata[,-1]

# Run BMA; maxCol is raised above its default so that all 44 candidate covariates (plus the intercept column) can be kept in the model search

> bayes1<-bicreg(x,y,maxCol=45)

# printing the fitted object shows that predictors p14, p37, p51, and p56 have high posterior inclusion probabilities

> bayes1

Call:

bicreg(x = x, y = y, maxCol = 45)

Posterior probabilities(%):

p1 p4 p5 p6 p11 p14 p17 p19 p20 p21 p23 p24 p33 p34 p37 p38

2.1 6.8 1.2 1.8 1.3 99.7 0.4 0.4 0.4 1.2 4.6 2.9 8.9 8.9 92.5 0.4

p39 p40 p41 p42 p45 p47 p50 p51 p54 p56 p57 p58 p59 p60 p61 p62

2.2 22.6 1.4 10.3 0.4 10.7 1.1 96.1 0.4 85.4 0.4 5.9 0.5 1.1 14.5 0.7

p63 p64 p65 p67 p68 p70 p74 p75 p76 p82 p83 p84

0.4 1.4 3.1 0.4 1.2 0.4 7.1 0.4 0.4 0.4 24.0 1.4

Coefficient posterior expected values:

(Intercept) p1 p4 p5 p6 p11

1.116e+02 -1.184e-02 6.976e-01 -9.960e-03 -1.212e-02 -5.514e-02

p14 p17 p19 p20 p21 p23

-1.597e+01 4.415e-03 3.246e-02 2.110e-02 1.685e-01 -4.120e-01

p24 p33 p34 p37 p38 p39

4.674e-02 -9.165e-01 -1.060e+00 -4.857e+00 -8.269e-04 -1.719e-02

p40 p41 p42 p45 p47 p50

2.270e+00 9.302e-02 1.156e-01 1.688e-03 -2.086e-01 7.333e-02

p51 p54 p56 p57 p58 p59

-1.332e+01 3.973e-03 -2.659e+00 -4.577e-03 -6.717e-01 3.552e-03

p60 p61 p62 p63 p64 p65

-1.496e-02 1.156e+00 -1.122e-02 -1.130e-02 -5.116e-02 7.178e-03

p67 p68 p70 p74 p75 p76

1.111e-02 -9.653e-02 2.090e-03 1.059e-01 -9.618e-03 -7.496e-03

p82 p83 p84

-8.890e-04 3.315e+00 8.275e-02

# a summary of the results shows the five best models, ranked by posterior probability. Note that p14, p37, p51, and p56 are included in all five models:

> summary(bayes1)

Call:

bicreg(x = x, y = y, maxCol = 45)

192 models were selected

Best 5 models (cumulative posterior probability = 0.1362 ):

p!=0 EV SD model 1 model 2 model 3 model 4 model 5

Intercept 100.0 1.116e+02 20.79991 111.419 109.391 109.437 114.331 112.989

p1 2.1 -1.184e-02 0.10679 . . . . .

p4 6.8 6.976e-01 3.13531 . . . . .

p5 1.2 -9.960e-03 0.12699 . . . . .

p6 1.8 -1.212e-02 0.11770 . . . . .

p11 1.3 -5.514e-02 0.73893 . . . . .

p14 99.7 -1.597e+01 5.09086 -16.232 -15.946 -15.313 -16.199 -15.517

p17 0.4 4.415e-03 0.53952 . . . . .

p19 0.4 3.246e-02 1.14039 . . . . .

p20 0.4 2.110e-02 1.05256 . . . . .

p21 1.2 1.685e-01 2.08397 . . . . .

p23 4.6 -4.120e-01 2.30079 . . . . .

p24 2.9 4.674e-02 0.31993 . . . . .

p33 8.9 -9.165e-01 3.45300 . . . . .

p34 8.9 -1.060e+00 4.02255 . . . . .

p37 92.5 -4.857e+00 2.23499 -5.268 -5.167 -5.574 -5.075 -5.794

p38 0.4 -8.269e-04 0.04108 . . . . .

p39 2.2 -1.719e-02 0.14495 . . . . .

p40 22.6 2.270e+00 4.84125 . . 9.219 . .

p41 1.4 9.302e-02 1.08067 . . . . .

p42 10.3 1.156e-01 0.39787 . . . . 1.072

p45 0.4 1.688e-03 0.04960 . . . . .

p47 10.7 -2.086e-01 0.72150 . . . . .

p50 1.1 7.333e-02 0.99301 . . . . .

p51 96.1 -1.332e+01 5.34423 -13.603 -14.738 -13.474 -14.760 -14.414

p54 0.4 3.973e-03 0.28198 . . . . .

p56 85.4 -2.659e+00 1.53376 -3.078 -3.171 -3.040 -3.350 -3.153

p57 0.4 -4.577e-03 0.15627 . . . . .

p58 5.9 -6.717e-01 3.13350 . . . . .

p59 0.5 3.552e-03 0.09560 . . . . .

p60 1.1 -1.496e-02 0.22703 . . . . .

p61 14.5 1.156e+00 3.24942 . . . 7.431 .

p62 0.7 -1.122e-02 0.24509 . . . . .

p63 0.4 -1.130e-02 0.40210 . . . . .

p64 1.4 -5.116e-02 0.62260 . . . . .

p65 3.1 7.178e-03 0.05077 . . . . .

p67 0.4 1.111e-02 0.40138 . . . . .

p68 1.2 -9.653e-02 1.31242 . . . . .

p70 0.4 2.090e-03 0.30554 . . . . .

p74 7.1 1.059e-01 0.45815 . . . . .

p75 0.4 -9.618e-03 0.34203 . . . . .

p76 0.4 -7.496e-03 0.32040 . . . . .

p82 0.4 -8.890e-04 0.20106 . . . . .

p83 24.0 3.315e+00 7.10008 . 11.023 . . .

p84 1.4 8.275e-02 0.92978 . . . . .

nVar 4 5 5 5 5

r2 0.164 0.182 0.181 0.180 0.178

BIC -10.857 -9.447 -9.382 -9.081 -8.749

post prob 0.050 0.025 0.024 0.021 0.017
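
The quantities printed above can also be pulled directly from the fitted object. As a minimal sketch (the component names probne0, postmean, and postsd and the function imageplot.bma() come from the BMA package documentation; they are not shown in the output above):

# posterior inclusion probabilities (%), posterior means, and posterior SDs of the coefficients

> bayes1$probne0

> bayes1$postmean

> bayes1$postsd

# image plot showing which covariates appear in the best models

> imageplot.bma(bayes1)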