In this blog post, you’ll learn how to add confidence intervals to a line plot in R using ggplot2, the popular visualization package that is part of the tidyverse.
First, let’s create some random data to work with. For demonstration purposes, I’ve created two time series from two normally distributed random variables. These are the steps I took:
- Series A has a mean of 3 and a standard deviation of 1; series B has a mean of 8 and a standard deviation of 2.
- I shuffle the data.
- I add an index, plus a lower and an upper bound at 75% and 125% of each value (fictional confidence intervals).
library(magrittr)  # provides the %<>% assignment pipe
library(dplyr)
library(ggplot2)

# 100 observations, drawn from two normal random variables
two_timeseries <- c(rnorm(50, 3, 1), rnorm(50, 8, 2))
df <- tibble(series = c(rep("A", 50), rep("B", 50)), data = two_timeseries)
df <- df[sample(100), ]  # shuffle the rows

# add lower and upper bounds, and an index
df %<>% mutate(low = data * 0.75, high = data * 1.25, index = seq(1, 100))
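The low and high columns above are made-up bounds. With real data you would usually compute them from repeated measurements. Here is a minimal sketch of one common approach, a t-based 95% confidence interval for the mean at each time point. The raw data frame, its column names and the choice of ten measurements per time point are all assumptions for illustration, not part of the example above.

```r
library(dplyr)

set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical raw data: 10 measurements at each of 20 time points
raw <- tibble(
  index = rep(1:20, each = 10),
  value = rnorm(200, mean = 5, sd = 1)
)

# t-based 95% confidence interval for the mean at each time point
ci <- raw %>%
  group_by(index) %>%
  summarise(
    count = n(),                                   # measurements per time point
    avg   = mean(value),                           # point estimate
    se    = sd(value) / sqrt(count),               # standard error of the mean
    low   = avg - qt(0.975, df = count - 1) * se,  # lower bound
    high  = avg + qt(0.975, df = count - 1) * se   # upper bound
  )
```

The resulting ci table has one row per index with avg, low and high columns, so it can be plotted the same way as df, using y = avg and ymin = low, ymax = high for the ribbon.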
Add confidence intervals to a ggplot2 line plot
Next, let’s plot this data as a line, and add a ribbon (using geom_ribbon) that represents the confidence interval. By adding an alpha (opacity) you can give it a nice shaded effect.
ggplot(df, aes(x = index, y = data, group = 1)) +
  geom_line(col = 'red') +
  geom_ribbon(aes(ymin = low, ymax = high), alpha = 0.1)
The result is a red line surrounded by a light grey shaded band.
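ggplot2 draws layers in the order you add them, so a ribbon added after the line sits on top of it. With a low alpha that hardly matters, but with a stronger fill you may prefer to add the ribbon first. Here is a small sketch of that ordering with a fixed fill color. The demo data frame is a hypothetical stand-in with the same columns as df, and the color and alpha values are arbitrary choices.

```r
library(ggplot2)

set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in data with the same columns as df
demo <- data.frame(index = 1:30, data = rnorm(30, mean = 5, sd = 1))
demo$low  <- demo$data * 0.75
demo$high <- demo$data * 1.25

p <- ggplot(demo, aes(x = index, y = data)) +
  # ribbon first, so the line is drawn on top of it
  geom_ribbon(aes(ymin = low, ymax = high), fill = "steelblue", alpha = 0.3) +
  geom_line(colour = "red")

p  # print the plot
```

Because fill is set outside aes() here, it applies one fixed color to the whole ribbon instead of mapping it to a variable.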
Customize a ggplot2 ribbon
You can also plot multiple lines, each with its own confidence interval, by mapping the group aesthetic to a variable (here, series).
ggplot(df, aes(x = index, y = data, group = series)) +
  geom_line(col = "black") +
  geom_point() +
  geom_ribbon(aes(ymin = low, ymax = high, fill = series),
              alpha = 0.1, linetype = "dashed", color = "grey")
geom_ribbon supports quite a few customization parameters. In the example above, I added a dashed grey border to the ribbon and mapped the fill color to series inside aes().
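To give an idea of what else you can tweak, here is a hedged sketch that picks the fill colors by hand with scale_fill_manual and changes how the border is drawn with outline.type. As far as I know, outline.type requires ggplot2 3.3.0 or later, so treat that as an assumption about your installed version. The demo data frame and the colors are made up for illustration.

```r
library(ggplot2)

set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in data: two series of 25 points each
demo <- data.frame(
  index  = rep(1:25, times = 2),
  series = rep(c("A", "B"), each = 25),
  data   = c(rnorm(25, 3, 1), rnorm(25, 8, 2))
)
demo$low  <- demo$data * 0.75
demo$high <- demo$data * 1.25

series_colours <- c(A = "steelblue", B = "darkorange")  # arbitrary choice

p <- ggplot(demo, aes(x = index, y = data, group = series)) +
  geom_ribbon(aes(ymin = low, ymax = high, fill = series),
              alpha = 0.2, colour = "grey40", linetype = "dotted",
              outline.type = "full") +  # outline the whole ribbon, not just top and bottom
  geom_line(aes(colour = series)) +
  scale_fill_manual(values = series_colours) +    # ribbon fill per series
  scale_colour_manual(values = series_colours)    # line color per series

p  # print the plot
```

Other values for outline.type are "both" (the default, upper and lower edges only), "upper" and "lower".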
By the way, if you’re having trouble understanding some of the code and concepts, I can highly recommend “An Introduction to Statistical Learning: with Applications in R”, which is the must-have data science bible. If you simply need an introduction to R, and less to the data science part, I can absolutely recommend this book by Richard Cotton. Hope it helps!