Calculate cumulative sum (cumsum) by group in R

In this blog post, I tackle a question that comes up again and again on forums and Q&A boards: how to calculate a cumulative sum within each group that the rows belong to.
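To make the goal concrete, here is a minimal sketch on a made-up data frame. The names toy, group, value and running_total are illustrative and do not come from the datasets used in this post. It relies on base R's ave(), which is introduced further down.

```r
# Toy data: two interleaved groups, "a" and "b"
toy <- data.frame(
  group = c("a", "a", "b", "a", "b"),
  value = c(1, 2, 10, 3, 20)
)

# The running total restarts for every group, but keeps the row order
toy$running_total <- ave(toy$value, toy$group, FUN = cumsum)

toy$running_total
# → 1 3 10 6 30
```

Group "a" accumulates 1, 3, 6 and group "b" accumulates 10, 30, each in the order its rows appear.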

Thanks to vegetableagony for pointing out that the conclusions can change depending on the size of the dataset. That’s why I edited this post and benchmarked several solutions on both a small and a large dataset.

Small datasets

First, let’s create the data and load the packages we’ll be using throughout this post.

library(plyr)        # provides ddply(); load it before dplyr to avoid masking issues
library(dplyr)
library(data.table)

data('mtcars')

dt <- data.table(mtcars)
df <- copy(mtcars)

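We will try to get the cumulative sum of the column hp, grouped by cyl and gear. (I know, it makes no practical sense.) In every solution, the rows are first sorted by hp, so the sum accumulates from the lowest to the highest value within each group.

First, let’s try to do it with only base functions. The underused function ave returns the cumulative sum by group as a single vector, in the order of the sorted rows, so you’ll have to bind it to your data frame yourself. The median time is 425 microseconds.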

with(
  df[order(df$hp),],
  ave(hp, cyl, gear, FUN = cumsum)
)
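Since ave() only returns a vector, one way to attach it is sketched below. The vector follows the row order of the sorted data, so it is attached to the sorted copy rather than to the original df. The names sorted and cumulative_sum are my own choices here.

```r
data("mtcars")
df <- mtcars

# Sort once and keep the sorted copy, so rows and results line up
sorted <- df[order(df$hp), ]

# Attach the grouped cumulative sum as a new column
sorted$cumulative_sum <- with(sorted, ave(hp, cyl, gear, FUN = cumsum))

head(sorted[, c("cyl", "gear", "hp", "cumulative_sum")])
```

A quick sanity check is that the largest cumulative_sum in each group equals the plain sum of hp for that group.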

Next, we’ll use ddply from the plyr package. This returns the data frame with the cumulative sum by group as a new column. When we benchmark it, we can clearly see that it is roughly ten times slower than ave: 4500 microseconds.

ddply(
  df[order(df$hp),],
  .(cyl, gear), 
  .fun = transform, 
  cumulative_sum = (cumsum(hp))
)

Next up, dplyr. The following chain of functions will return a tibble with the cumulative sum by group as a new column. Not as fast as the first solution, but still a lot faster than ddply(): 830 microseconds.

df %>% 
	arrange(hp) %>% 
	group_by(cyl, gear) %>%   # group by cyl and gear, not by hp itself
	mutate(cumulative_sum = cumsum(hp))

Finally, let’s go with data.table. I propose two solutions. The first one returns only the cumulative sum by group and the columns it was grouped by. The second one adds the cumulative sum by group as a new column to the data table. Both solutions are somewhat slow (2200 microseconds), which isn’t what we expect from data.table.

dt[order(hp)][,.(cumulative_sum = cumsum(hp)), by = .(cyl, gear)]
dt[order(hp)][,cumulative_sum := cumsum(hp), by = .(cyl, gear)]
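One detail to be aware of with the second form: dt[order(hp)] creates a new, sorted data.table, so := adds the column to that temporary copy and not to dt itself. A minimal sketch of two ways to keep the result follows. The names result and dt_inplace are illustrative.

```r
library(data.table)

dt <- data.table(mtcars)

# Option 1: capture the sorted copy that := modified
result <- dt[order(hp)][, cumulative_sum := cumsum(hp), by = .(cyl, gear)]

"cumulative_sum" %in% names(dt)      # FALSE: the original is untouched
"cumulative_sum" %in% names(result)  # TRUE

# Option 2: sort by reference, then add the column in place
dt_inplace <- data.table(mtcars)
setorder(dt_inplace, hp)
dt_inplace[, cumulative_sum := cumsum(hp), by = .(cyl, gear)]
```

Option 2 avoids the intermediate copy, at the cost of reordering the table itself.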

Large datasets

Let’s load the flights dataset.

library(nycflights13)

dt <- data.table(flights)
df <- copy(flights)

Let’s try the ave solution first, this time accumulating dep_delay grouped by carrier and year. Given the size of the dataset, this is a lot slower: a median time of 1160 microseconds.

with(
  df[order(df$dep_delay),],
  ave(dep_delay, carrier, year, FUN = cumsum)
)
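One caveat worth checking with this dataset: dep_delay contains missing values, and cumsum() propagates an NA to every element that follows it. Because order() puts NAs last by default, only the tail of each group is affected. The sketch below uses a made-up vector, and treating a missing delay as zero is an assumption that may not suit your analysis.

```r
# Made-up delays with one missing value
delays <- c(-3, 5, NA, 12)

# Everything from the first NA onwards becomes NA
with_na <- cumsum(delays)
with_na
# → -3 2 NA NA

# Assumption: count a missing delay as zero before accumulating
delays_filled <- ifelse(is.na(delays), 0, delays)
without_na <- cumsum(delays_filled)
without_na
# → -3 2 2 14
```

Dropping the rows with a missing dep_delay before grouping is an alternative if you would rather not impute a value.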

So, let’s go with the ddply solution now. Compared to ave, it’s five to six times slower: 6170 microseconds. For the small dataset it was a factor of 10, now it’s only a factor of 5 to 6, so that’s an improvement, I guess? 🙂

ddply(
  df[order(df$dep_delay),],
  .(carrier, year), 
  .fun = transform, 
  cumulative_sum = (cumsum(dep_delay))
)

Next up is dplyr. For the small dataset, it took about double the time of ave to generate the result. Now, it’s a little faster than the ave solution: 980 microseconds.

df %>% 
	arrange(dep_delay) %>% 
	group_by(carrier, year) %>% 
	mutate(cumulative_sum = cumsum(dep_delay))

Finally, I’m going for data.table again. After its slow showing on the small dataset, it now gets the job done in 600 microseconds. This is what we expect, given the advertised speed of data.table.

dt[order(dep_delay)][,.(cumulative_sum = cumsum(dep_delay)), by = .(carrier, year)]
dt[order(dep_delay)][,cumulative_sum := cumsum(dep_delay), by = .(carrier, year)]

Conclusion

In conclusion, for small datasets the base function ave is the fastest solution. As you can see, sometimes you have to kill your darlings. Yet for big datasets, nothing beats data.table.
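If you want to check these timings on your own machine, a sketch along the following lines should work. It assumes the microbenchmark package, which is my choice for illustration and not necessarily the tool behind the numbers quoted above. It compares only the ave and dplyr solutions on mtcars.

```r
library(dplyr)
library(microbenchmark)

df <- mtcars

# Run each solution 100 times and collect the timings
timings <- microbenchmark(
  ave = with(
    df[order(df$hp), ],
    ave(hp, cyl, gear, FUN = cumsum)
  ),
  dplyr = df %>%
    arrange(hp) %>%
    group_by(cyl, gear) %>%
    mutate(cumulative_sum = cumsum(hp)),
  times = 100
)

# Median time per solution, reported in microseconds
timing_summary <- summary(timings, unit = "us")
timing_summary[, c("expr", "median")]
```

Absolute numbers depend on hardware and package versions, so the ranking between solutions is more informative than the exact values.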

By the way, if you’re having trouble understanding some of the code and concepts, I can highly recommend “An Introduction to Statistical Learning: with Applications in R”, which is the must-have data science bible. If you simply need an introduction into R, and less into the Data Science part, I can absolutely recommend this book by Richard Cotton. Hope it helps!

Great success!
