Plotting bars in ggplot2 is easy. Yet, in many cases, you want to order these bars according to their frequency (count) or according to any other numeric value. In this blog post, I show you three ways to achieve this.
First, let’s load the libraries, read the Titanic training data into a data.table, and convert Pclass to a factor.
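library(titanic)
library(data.table)
library(dplyr)
library(forcats)
library(ggplot2)
library(ggthemes)

# Copy the Titanic training set into a data.table
df <- data.table(titanic::titanic_train)
# Treat passenger class as a categorical variable
df[, Pclass := as.factor(Pclass)]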
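The first solution is the dplyr way. Group the data frame, summarise the count per group, and pass the result to the ggplot function. In your aesthetics, you can use the reorder function to order the bars by their frequency. reorder sorts the factor levels in ascending order of its second argument, so passing -count puts the most frequent class first.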
df %>%
  group_by(Pclass) %>%
  summarise(count = n()) %>%
  # Order the classes by descending count
  ggplot(aes(x = reorder(Pclass, -count), y = count)) +
  # The counts are precomputed, so plot them as they are
  geom_bar(stat = 'identity') +
  theme_clean()
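To see what reorder does without drawing a plot, here is a minimal sketch on a toy data frame. The category names and counts are made up for illustration and are not taken from the Titanic data.

```r
# Toy data: three categories with made-up counts (hypothetical values)
toy <- data.frame(
  category = factor(c("a", "b", "c")),
  count    = c(5, 20, 10)
)

# reorder() sorts the levels by ascending value of its second argument,
# so negating the counts yields a descending order
ordered_category <- reorder(toy$category, -toy$count)

# Inspect the new level order; ggplot2 draws the bars in this order
print(levels(ordered_category))
```

Dropping the minus sign should give the opposite order, with the least frequent category first. Also note that geom_col() is shorthand for geom_bar(stat = 'identity'), if you prefer the shorter spelling.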
The second solution is the data.table way. It’s the same line of thought but the syntax is more concise.
# Count the rows per class with data.table, then plot in descending order
ggplot(df[, .(count = .N), by = Pclass],
       aes(x = reorder(Pclass, -count), y = count)) +
  geom_bar(stat = 'identity') +
  theme_clean()
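If the data.table syntax is new to you, it can help to look at the aggregated table on its own before handing it to ggplot. This is a small sketch on a toy table; the column name grp and its values are hypothetical.

```r
library(data.table)

# Toy table with one categorical column (made-up values)
toy <- data.table(grp = c("a", "b", "b", "c", "c", "c"))

# .N is the number of rows in each group, so this is a count per group
counts <- toy[, .(count = .N), by = grp]

# Sort by descending count to preview the order the bars will have
print(counts[order(-count)])
```

The count column here plays the same role as the one produced by summarise(count = n()) in the dplyr version.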
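The third solution uses forcats, another tidyverse package, built for working with categorical variables. It offers a really handy function: fct_infreq() reorders the levels of a factor by how often each level occurs, most frequent first. Because it counts the raw rows itself, you don’t need to aggregate the data beforehand.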
# fct_infreq() orders the classes by frequency; geom_bar() counts the rows
ggplot(df, aes(x = fct_infreq(Pclass))) +
  geom_bar(stat = 'count') +
  theme_clean()
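As a quick check of what fct_infreq() returns, here is a minimal sketch on a made-up factor. It also uses fct_rev(), another forcats function, in case you want the least frequent bar first.

```r
library(forcats)

# Toy factor: "z" appears three times, "y" twice, "x" once (made-up values)
f <- factor(c("x", "y", "y", "z", "z", "z"))

# Levels ordered from most to least frequent
most_first <- fct_infreq(f)
print(levels(most_first))

# Reverse the level order to get least frequent first
least_first <- fct_rev(fct_infreq(f))
print(levels(least_first))
```

Wrapping the x aesthetic in fct_rev(fct_infreq(Pclass)) should therefore flip the bar order in the plot above.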
By the way, if you’re having trouble understanding some of the code and concepts, I can highly recommend “An Introduction to Statistical Learning: with Applications in R”, which is the must-have data science bible. If you simply need an introduction to R, and less to the data science part, I can absolutely recommend this book by Richard Cotton. Hope it helps!