Set Significant Letters in the Bar Plot in ggplot2

Liang / 2018-05-04


Problem

As a biologist, one daily work is comparing the treatment and untreatment group and find if there are any significant difference. One common way is drawing a barplot with error bar and put the significant letters a little higher than the bar plot. Like a figure showing below: annotations for significant differences based on p value.

Results are usually drawn in a barplot. For all variables with the same letter, the difference between the means is not statistically significant. If two variables have different letters, they are significantly different.

But draw this graph in ggplot are not easy. This post will solve this problem.

Solution

prepare data

x<-c(25.6,22.2,28.0,29.8,24.4,30.0,29.0,27.5,25.0,27.7,23.0,32.2,28.8,28.0,31.5,25.9,20.6,21.2,22.0,21.2) 
b<-data.frame(x,a=gl(5,4,20)) ## a is factor

head(b)
##      x a
## 1 25.6 1
## 2 22.2 1
## 3 28.0 1
## 4 29.8 1
## 5 24.4 2
## 6 30.0 2

anova test and Tukey multiple comparisons

m1<-aov(x~a,data=b)
TukeyHSD(m1)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = x ~ a, data = b)
## 
## $a
##       diff        lwr        upr     p adj
## 2-1  1.325  -4.718582  7.3685818 0.9584566
## 3-1  0.575  -5.468582  6.6185818 0.9981815
## 4-1  2.150  -3.893582  8.1935818 0.8046644
## 5-1 -5.150 -11.193582  0.8935818 0.1140537
## 3-2 -0.750  -6.793582  5.2935818 0.9949181
## 4-2  0.825  -5.218582  6.8685818 0.9926905
## 5-2 -6.475 -12.518582 -0.4314182 0.0330240
## 4-3  1.575  -4.468582  7.6185818 0.9251337
## 5-3 -5.725 -11.768582  0.3185818 0.0675152
## 5-4 -7.300 -13.343582 -1.2564182 0.0146983

TukeyHSD means Tukey’s ‘Honest Significant Difference’ method which create a set of confidence intervals on the differences between the means of the levels of a factor with the specified family-wise probability of coverage. From the result, only 5 and 2, 5 and 4 have significantly different.

HSD.test mutiple comparisons

library(agricolae)
results <- HSD.test(m1, "a", group=TRUE)
results$groups
##        x groups
## 4 28.550      a
## 2 27.725      a
## 3 26.975     ab
## 1 26.400     ab
## 5 21.250      b
oder.group <- results$groups[order(rownames(results$groups)),]

Using HSD.test function, we can assign our group to significant letters, the default significant level is 0.05.

Creat barplot with error bar and significant letters.

library(Rmisc)
## Loading required package: lattice
## Loading required package: plyr
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:purrr':
## 
##     compact
stats <- summarySE(b, measurevar="x", groupvars=c("a"),na.rm=TRUE)
stats
##   a N      x        sd        se        ci
## 1 1 4 26.400 3.2863353 1.6431677 5.2292929
## 2 2 4 27.725 2.4431878 1.2215939 3.8876570
## 3 3 4 26.975 3.9802638 1.9901319 6.3334879
## 4 4 4 28.550 2.3158872 1.1579436 3.6850933
## 5 5 4 21.250 0.5744563 0.2872281 0.9140881
library(ggplot2)
ggplot(stats, aes(x=a, y=x)) + 
  geom_bar(position=position_dodge(.5), stat="identity",
             colour="black", width = 0.5) +      # Change bar size with width = 0.5
    geom_errorbar(aes(ymin=x-se, ymax=x+se),
                  size=.5,    # Thinner lines
                  width=.2,
                  position=position_dodge(.5)) + 
  theme_classic() + geom_text(aes(x=a, y=x+se+2,label=oder.group$groups), position=position_dodge(width=0.9), size=4)#text(x = stats$a, y = stats$x+stats$se+5, labels = c("ab", "a", "ab", "a", "b"))

For the graph above, we know only 5 have significant difference with 2 and 4.

Last modified on 2018-05-04