Skip to content
Home ยป How to flatten MultiIndex columns in Pandas

How to flatten MultiIndex columns in Pandas

  • by
  • 2 min read

There are very few occasions when you really need a hierarchical column index in Pandas. Mostly, you just want to identify a column with a single column name. In this blog post, we’ll learn how to properly flatten a DataFrame with MultiIndex columns.

One of the things that annoys me a lot in Pandas, is how it returns MultiIndex columns, after using agg() when you apply multiple functions to one column. To get rid of the MultiIndex, we need to take two steps.

๐Ÿ™Œ Side note: make sure you have Pandas >= 0.24.

Two steps to flatten MultiIndex columns

โ˜ Step 1: flatten the index

Using the to_flat_index function, we can make sure that all columns contain all levels of the index.

df.columns = df.columns.to_flat_index()

โœŒ Step 2: concatenate all levels of the index

After you have flattened the index, you can apply this list comprehension over the index. For each column, it will join the different levels of the index as one string, with a separating _ in between.

df.columns = ['_'.join(column) for column in df.columns]

Or put it all together

If you want to apply the two steps in one go, you can simply do this:

df.columns = ['_'.join(column) for column in df.columns.to_flat_index()]

Good job!

Say thanks, ask questions or give feedback

Technologies get updated, syntax changes and honestly… I make mistakes too. If something is incorrect, incomplete or doesn’t work, let me know in the comments below and help thousands of visitors.

Leave a Reply

Your email address will not be published. Required fields are marked *