I often need to calculate the difference of something between two periods. If each row represents a period, the fastest approach is to *lag* the variable you need to perform calculations on.

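In R, the *lag()* function from the stats package is an option, but you’ll notice it won’t work in data frames: it shifts the time index of a time series rather than the values themselves, so a plain column comes back unchanged.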
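A minimal sketch of that behaviour, using a made-up vector `x` for illustration: the values that come back from `stats::lag()` are identical to the input, and only the `tsp` (time series) attribute has moved.

```r
# Hypothetical example vector, not from the original post
x <- c(10, 20, 30, 40, 50)

# stats::lag() shifts the time index, not the values
lagged <- stats::lag(x, 1)

# The values are still in the same positions
as.vector(lagged)

# Only the time series attribute (start, end, frequency) was shifted back by one
tsp(lagged)
```

Assigning this result to a data frame column would therefore give you a copy of the original column, not a lagged one.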

There’s a manual, not so elegant solution. For example, I can create a lagged variable with an offset of one simply by adding an NA in front of a vector and removing the last item. A leading variable with an offset of one works the other way around: remove the first item and add an NA at the end.

```
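library(data.table)
dt <- data.table(base = seq(1, 10, 1))
# Lag by one: prepend NA and drop the last value
dt$lagged_manual <- c(NA, dt$base[1:(nrow(dt) - 1)])
# Lead by one: drop the first value and append NA
dt$leading_manual <- c(dt$base[2:nrow(dt)], NA)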
```

See, not so elegant. However, the *data.table* package offers some really nice functionality for creating leading and lagged variables. The *shift()* function provides lagging and leading capabilities with an easy-to-use interface: you pass the column, the offset and the direction via `type`.

```
dt[,lagged_base := shift(base, 1, type = 'lag')]
dt[,leading_base := shift(base, 1, type = 'lead')]
```
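Coming back to the original goal, the difference between two periods, here is a minimal sketch of how *shift()* could be used for it. The tables `sales` and `panel` and their columns are made up for illustration; the second part assumes you have several groups (here: shops) and want the lag to restart within each one.

```r
library(data.table)

# Hypothetical data: one row per period
sales <- data.table(period = 1:5, revenue = c(100, 120, 90, 150, 160))

# Difference between each period and the one before it
# (the first row is NA because there is no previous period)
sales[, revenue_diff := revenue - shift(revenue, 1, type = "lag")]

# Hypothetical panel data: several periods per shop
panel <- data.table(
  shop    = rep(c("a", "b"), each = 3),
  period  = rep(1:3, times = 2),
  revenue = c(10, 15, 12, 200, 210, 190)
)

# Make sure rows are in period order within each shop before lagging
setorder(panel, shop, period)

# Using by = shop keeps the lag from leaking across shops
panel[, revenue_diff := revenue - shift(revenue, 1, type = "lag"), by = shop]
```

Note that *shift()* works on row order, not on the values of a period column, so sorting first matters.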

By the way, if you’re having trouble understanding some of the code and concepts, I can highly recommend “An Introduction to Statistical Learning: with Applications in R”, which is the must-have data science bible. If you simply need an introduction to R, and less to the data science part, I can absolutely recommend this book by Richard Cotton. Hope it helps!

Great success!

