3  Bayes Factor ANOVA

3.1 About this chapter

  1. Questions
    • How can I do an ANOVA?
  2. Objectives
    • Understand multiplicity is not a problem for Bayes Factors
  3. Keypoints
    • The package simplebf automates Bayes Factor \(t\)-tests for many samples

3.2 The issue of multiplicity in Frequentism and Bayesianism

The ANOVA is often seen to be a catch-all test that can be used for an experiment that has more than two samples in it. Experimenters often understand this to be true on the basis that ‘you shouldn’t do \(t\)-tests for more than two samples by repeating the \(t\)-test’. This is quite true and is a strategy for avoidance of the problem of multiplicity.

Multiplicity or multiple testing occurs when we do lots of tests one after the other, in a batch. The more we do, the more likely we are to make an error in our conclusions (not in our working). This happens in Frequentist statistical tests because the \(p\)-value expresses a fixed error rate that we are happy to accept.

Recall that the \(t\)-test has two hypotheses (of which we test just one)

\(H_0 : \bar{x_1} - \bar{x_2} = 0\)

\(H_1 : \bar{x_1} - \bar{x_2} \neq 0\)

and we set a level at which would reject \(H_0\) usually \(p < 0.05\). The \(p\) reflects the proportion of times that the difference observed is seen in the null model by chance (so we see the difference 1 in 20 times by chance), in other words in a proportion of 0.95 of times we would reject the null correctly. Which is fine for just one comparison.

If we do more than one test we must multiply these probabilities together, giving \(0.95 * 0.95 = 0.9025\). This is catastrophic, by doing just two tests we reduce the proportion of times we choose the correct hypothesis to 0.9025, down from 19/20 to 18/20, we make twice as many mistakes! For more tests this gets worse.

Frequentist statistics have lots of corrections for this sort of problem and the ANOVA post-hoc tests are in part a way of doing that. The good news for those using Bayes Factors is that this problem does not exist. Because we don’t have a fixed error rate, it doesn’t get bigger when we do more tests. We are free to do as many hypothesis comparisons as we wish.

3.3 Automating BayesFactor::ttest() for many comparisions

As there isn’t a need for a Bayes Factor analogue to the ANOVA and post-hoc tests, we can just use the \(t\)-test analogue over and over again. If we have a multiple sample dataset we just need a book-keeping method to pull out the samples of interest.

Let’s draft one with dplyr and the Plant Growth data set.

library(dplyr)
library(BayesFactor)

small_df <- PlantGrowth %>% 
    filter(group %in% c("ctrl", "trt1")) %>% 
    droplevels()
  
ttestBF(formula = weight ~ group, data = small_df)

This pattern helps you extract the pairs of samples you need, though you would need to repeat it every time you wanted to analyse a new pair. A convenience function for the simple case that allows us to do BayesFactor::ttestBF() for all pairs in a specified column in a dataframe exists in the package simplebf. It works like this:

library(simplebf)
result <- allpairs_ttestbf(PlantGrowth, 
                           group_col = "group", data_col = "weight", 
                           rscale = "medium", 
                           h_1 = "test_greater_than_control")

knitr::kable(result, digits = 4)
control_group test_group h_0 h_1 BayesFactor odds_h_1 summary
trt1 ctrl ctrl equal to trt1 ctrl greater than trt1 1.0834 1:1.0834 Anecdotal evidence for H_1 compared to H_0
trt2 ctrl ctrl equal to trt2 ctrl greater than trt2 0.1622 1:0.1622 Substantial evidence for H_0 compared to H_1
ctrl trt1 trt1 equal to ctrl trt1 greater than ctrl 0.2167 1:0.2167 Substantial evidence for H_0 compared to H_1
trt2 trt1 trt1 equal to trt2 trt1 greater than trt2 0.1363 1:0.1363 Substantial evidence for H_0 compared to H_1
ctrl trt2 trt2 equal to ctrl trt2 greater than ctrl 3.3872 1:3.3872 Substantial evidence for H_1 compared to H_0
trt1 trt2 trt2 equal to trt1 trt2 greater than trt1 12.6445 1:12.6445 Strong evidence for H_1 compared to H_0

The results are pretty easy to read. Note we can set rscale values as in the ttestBF() and we can choose one of three values for \(H_1\) test_greater_than_control, test_less_than_control and test_not_equal_to_control.

Roundup
  • Bayes Factors do not need multiple hypothesis corrections
  • simplebf is a package for automating the comparison of all groups in a single variable in a tidy dataframe