
In this blog post, I elaborate on setting axis limits in a plot generated by ggplot2. There are two ways: one where you pretend the data outside the limits doesn’t exist (using lims, xlim or ylim), and one where you respect that the data outside the limits exists (using coord_cartesian).

The documentation for the lims, xlim and ylim functions states the following about values outside the limits:

This is a shortcut for supplying the limits argument to the individual scales. Note that, by default, any values outside the limits will be replaced with NA.
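To make “replaced with NA” concrete, here is a minimal sketch using censor from the scales package, which, as far as I know, is the default out-of-bounds handler for continuous ggplot2 scales. The example values and the range are made up for illustration.

```r
library(scales)

# Hypothetical values: two inside the limits, one outside
values <- c(1, 5, 10)

# Anything outside the range 0 to 7 is replaced with NA, not clipped to 7
censored <- censor(values, range = c(0, 7))
print(censored)
```

Once a value has been turned into NA, later steps such as a smoother no longer see it.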

And this is what the documentation says about coord_cartesian:

Setting limits on the coordinate system will zoom the plot (like you’re looking at it with a magnifying glass), and will not change the underlying data like setting limits on a scale will.

Hadley Wickham, one of the most important figures in the R community, wrote about it in his book:

Here’s an example. First, we create some dummy data, an X and a Y that are closely correlated. We also add some outliers to the data. Lastly, we plot it, without setting any limits on the axes.
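A minimal sketch of these steps could look like this. The seed, sample sizes, ranges and the names df and p are my own assumptions for illustration, not necessarily the values behind the original figure.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only for reproducibility

# 100 closely correlated points with X below 7, plus 10 outliers on the right
x <- c(runif(100, min = 0, max = 7), runif(10, min = 15, max = 20))
y <- c(x[1:100] + rnorm(100, sd = 0.5), runif(10, min = 0, max = 2))
df <- data.frame(x = x, y = y)

# Scatter plot with a linear smoother and no limits on the axes
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```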

This is what the data looks like: two strongly correlated series where X is smaller than 7, with the outliers on the right. I also added a linear smoother to demonstrate my point later on. What we see:

All the data is visible, even the outliers.

This smoother is based on all the data, even the outliers.

We can limit our X and Y axes using the xlim and ylim functions as follows.
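Here is a sketch of the xlim and ylim approach. The dummy data is recreated with assumed values so the block runs on its own, and the limits 0 to 7 and -2 to 10 are my own choice.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only for reproducibility

# Same kind of dummy data: correlated points below 7 and outliers on the right
x <- c(runif(100, min = 0, max = 7), runif(10, min = 15, max = 20))
y <- c(x[1:100] + rnorm(100, sd = 0.5), runif(10, min = 0, max = 2))
df <- data.frame(x = x, y = y)

# Limits set on the scales: values outside them become NA before the
# smoother is computed, so ggplot2 warns about removed rows
p_lims <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  xlim(0, 7) +
  ylim(-2, 10)

print(p_lims)
```

Because the outliers are dropped before the fit, the smoother here only reflects the points inside the limits.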

We can also limit the axes using coord_cartesian. The plot then shows the same range. However, we respect that the outliers exist and the smoother is still based on all the data.
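The coord_cartesian approach can be sketched like this. Again, the dummy data and the limits are assumptions of mine so that the block is runnable by itself.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only for reproducibility

# Same kind of dummy data: correlated points below 7 and outliers on the right
x <- c(runif(100, min = 0, max = 7), runif(10, min = 15, max = 20))
y <- c(x[1:100] + rnorm(100, sd = 0.5), runif(10, min = 0, max = 2))
df <- data.frame(x = x, y = y)

# Limits set on the coordinate system: the view is zoomed, the data is untouched
p_zoom <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  coord_cartesian(xlim = c(0, 7), ylim = c(-2, 10))

print(p_zoom)
```

The outliers are outside the visible area, but they still pull on the fitted line.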

Great success!
