
Calculating shares by row in R using data.table


In R, I often need to calculate the share of each column by row. In this very simple example, I would like to update the following table:

apples  oranges  bananas  pineapples
2       4        6        3
1       0        9        2
5       1        2        3

and I would like it to become:

apples  oranges  bananas  pineapples
0.13    0.27     0.40     0.20
0.08    0.00     0.75     0.17
0.45    0.09     0.18     0.27

As you can see, every cell now contains its value's share of the row total. Using data.table, there is an easy way to do this.

rsums <- rowSums(fruit)  # one total per row
# Divide every column by the row totals
fruit <- fruit[, lapply(.SD, function(x) x / rsums)]
rm(rsums)
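
To try this end to end, here is a minimal sketch. It assumes the example table is built by hand with data.table() (the post does not show how fruit was created), and it stores the result in a separate, hypothetical variable called shares so the original counts stay available for comparison.

```r
library(data.table)

# Assumption: the example table from above, typed in by hand
fruit <- data.table(
  apples     = c(2, 1, 5),
  oranges    = c(4, 0, 1),
  bananas    = c(6, 9, 2),
  pineapples = c(3, 2, 3)
)

# Row totals: one value per row
rsums <- rowSums(fruit)

# Divide every column by the row totals (row i is divided by rsums[i])
shares <- fruit[, lapply(.SD, function(x) x / rsums)]

# Sanity check: each row of shares should add up to 1
stopifnot(all(abs(rowSums(shares) - 1) < 1e-9))

# Round only for display
print(shares[, lapply(.SD, round, 2)])
```

The check on the row sums is a cheap way to confirm that the division really happened by row and not by column.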

In the data.table package, .SD stands for “subset of the data”. By calling lapply over .SD without specifying .SDcols, we apply the function to all columns. If you only want to apply the function to apples and oranges, use:

rsums <- rowSums(fruit)  # row totals still include all four columns
# Note the spelling: .SDcols, with a lowercase "c"
fruit <- fruit[, lapply(.SD, function(x) x / rsums), .SDcols = c('apples', 'oranges')]
rm(rsums)
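
One thing to be aware of: the call above returns only the selected columns, so assigning it back to fruit drops bananas and pineapples. The sketch below shows that behaviour next to a possible alternative using data.table's := operator, which overwrites the selected columns by reference and leaves the others as counts. The names cols and subset_shares are hypothetical, and the table is again assumed to be built by hand with numeric (double) columns.

```r
library(data.table)

# Assumption: the example table from above, typed in by hand
fruit <- data.table(
  apples     = c(2, 1, 5),
  oranges    = c(4, 0, 1),
  bananas    = c(6, 9, 2),
  pineapples = c(3, 2, 3)
)

cols  <- c("apples", "oranges")  # columns to convert into shares
rsums <- rowSums(fruit)          # totals over all four columns

# Returns a new table holding only the selected columns
subset_shares <- fruit[, lapply(.SD, function(x) x / rsums), .SDcols = cols]

# Variant: update the selected columns in place, keeping the other columns
fruit[, (cols) := lapply(.SD, function(x) x / rsums), .SDcols = cols]

print(subset_shares)
print(fruit)
```

Because rsums is computed over all four columns, the two shares in a row will not add up to 1. If you want shares within apples and oranges only, compute the totals from those columns instead, for example with rowSums(fruit[, ..cols]) before any update.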

Good luck!
