**In this blog post, I elaborate on setting axis limits in a plot generated by ggplot2. There are two ways: one where you pretend the data outside the limits doesn’t exist (using `lims`), and one where you respect that the data outside the limits exists (using `coord_cartesian`).**

The documentation for the lims, xlim and ylim functions states the following about values outside the limits:

This is a shortcut for supplying the `limits` argument to the individual scales. Note that, by default, any values outside the limits will be replaced with `NA`.
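To make "replaced with `NA`" concrete, here is a minimal base R sketch. Note that `censor_to_limits` is a hypothetical helper written for this illustration, not a ggplot2 function; it only mimics the behaviour the documentation describes.

```r
# Hypothetical helper (not part of ggplot2): mimic the documented default,
# where values outside the limits are replaced with NA.
censor_to_limits <- function(values, limits) {
  values[values < limits[1] | values > limits[2]] <- NA
  values
}

x <- c(-3, 0, 5, 9)
censored <- censor_to_limits(x, limits = c(-2, 7))

# The two values outside [-2, 7] are now missing; the others are untouched
print(censored)
```

Anything computed afterwards, such as a smoother, only sees the values that survived.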

And this is what the documentation says about coord_cartesian:

Setting limits on the coordinate system will zoom the plot (like you’re looking at it with a magnifying glass), and will not change the underlying data like setting limits on a scale will.

Hadley Wickham, one of the most important figures in the R community, wrote about it in his book:

Here’s an example. First, we create some dummy data, an X and a Y that are closely correlated. We also add some outliers to the data. Lastly, we plot it, without setting any limits on the axes.

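```
library(ggplot2)
library(data.table)

set.seed(10)  # make the random data reproducible

# 100 "normal" points: y follows x closely, plus uniform noise
normal_data_x <- rnorm(100, 3, 2)
normal_data_y <- normal_data_x + runif(100, -2, 2)

# 25 outliers far to the right, with much larger y values
outliers_x <- runif(25, 8, 10)
outliers_y <- outliers_x ^ runif(25, 1, 2)

d <- data.table(x = c(normal_data_x, outliers_x),
                y = c(normal_data_y, outliers_y))

# Plot everything, with a linear smoother and no axis limits
ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = 'lm')
```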

The resulting plot shows two strongly correlated series where X is smaller than 7, with the outliers on the right. I also added a linear smoother to demonstrate my point later on. What we see:

- All the data is visible, even the outliers.
- The **smoother is based on all the data, even the outliers.**

We can limit our X and Y axes using the *xlim* and *ylim* functions as follows.

```
# Scale limits: values outside the limits become NA before the smoother is fitted
ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  xlim(-2, 7) + ylim(-1, 12)
```

We now observe:

- The outliers are no longer visible.
- The **smoother is based on the data without the outliers**, because the values outside the limits were replaced with `NA` before the fit.
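To see how much the fit changes, here is a rough sketch that skips the plotting and fits the two linear models directly. It regenerates the same dummy data in a plain `data.frame` (an assumption made to keep the example dependency-free), and the names `inside`, `fit_all` and `fit_inside` are mine.

```r
set.seed(10)
normal_data_x <- rnorm(100, 3, 2)
normal_data_y <- normal_data_x + runif(100, -2, 2)
outliers_x <- runif(25, 8, 10)
outliers_y <- outliers_x ^ runif(25, 1, 2)
d <- data.frame(x = c(normal_data_x, outliers_x),
                y = c(normal_data_y, outliers_y))

# Rows that fall inside both the x limits (-2, 7) and the y limits (-1, 12)
inside <- d$x >= -2 & d$x <= 7 & d$y >= -1 & d$y <= 12

fit_all    <- lm(y ~ x, data = d)            # every row, outliers included
fit_inside <- lm(y ~ x, data = d[inside, ])  # only rows within the limits

# Compare intercept and slope of the two fits
print(coef(fit_all))
print(coef(fit_inside))
```

If the scale limits behave as documented, the smoother in the plot above should correspond to `fit_inside`, while the smoother in the unrestricted plot corresponds to `fit_all`.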

Finally, we can limit our X and Y axes using the *coord_cartesian* function.

```
# Coordinate limits: zoom in on the plot without changing the underlying data
ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  coord_cartesian(xlim = c(-2, 7), ylim = c(-1, 12))
```

As you can see, now:

- Once again, we no longer observe our outliers.
- However, we respect that outliers exist and the **smoother is based on all the data**.
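One way to check this difference without looking at the plots is to inspect what ggplot2 computed for the smoother layer. The sketch below assumes a ggplot2 version that provides `layer_data()`, rebuilds the dummy data in a plain `data.frame`, and uses variable names of my own choosing.

```r
library(ggplot2)

set.seed(10)
normal_data_x <- rnorm(100, 3, 2)
normal_data_y <- normal_data_x + runif(100, -2, 2)
outliers_x <- runif(25, 8, 10)
outliers_y <- outliers_x ^ runif(25, 1, 2)
d <- data.frame(x = c(normal_data_x, outliers_x),
                y = c(normal_data_y, outliers_y))

base_plot <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

p_lims <- base_plot + xlim(-2, 7) + ylim(-1, 12)
p_zoom <- base_plot + coord_cartesian(xlim = c(-2, 7), ylim = c(-1, 12))

# Layer 2 is the smoother; layer_data() returns the values computed for it.
# The scale-limit version warns about removed rows, which we silence here.
smooth_lims <- suppressWarnings(layer_data(p_lims, 2))
smooth_zoom <- layer_data(p_zoom, 2)

# x range over which each smoother was evaluated
print(range(smooth_lims$x))
print(range(smooth_zoom$x))
```

With scale limits the smoother should stop at or before 7, whereas with `coord_cartesian` it should extend out to the outliers even though that part of the line is not drawn in the zoomed view.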

By the way, if you’re having trouble understanding some of the code and concepts, I can highly recommend “An Introduction to Statistical Learning: with Applications in R”, which is the must-have data science bible. If you simply need an introduction to R, and less to the data science part, I can absolutely recommend this book by Richard Cotton. Hope it helps!

Great success!