# Solve “columns overlap but no suffix specified” in Pandas

The Pandas error “columns overlap but no suffix specified” is one I ran into surprisingly late. Solving it is usually very straightforward, and we’ll tackle it in this blog post.

First, let’s take a closer look at the error:

“ValueError: columns overlap but no suffix specified: Index([<list of columns>], dtype='object')”

When you run into this error, in all likelihood you are using the Pandas join method. What the error is telling you is that some column names exist in both of the DataFrames you’re trying to join, and they are not the key you join on. Pandas is asking you to provide a suffix for those overlapping column names (lsuffix for the left DataFrame, rsuffix for the right one), so you can tell them apart in the joined DataFrame.

In the following code snippet, you can see that the columns ‘b’ and ‘c’ exist in both DataFrames, while we join on the index ‘a’. This will produce the error.

import pandas as pd

df1 = pd.DataFrame({'a': [1,0,2,3,4],'b': [0,0,0,0,0], 'c': [9,12,15,16,54]})
df1 = df1.set_index('a')  # set_index returns a new DataFrame, so assign it back
df2 = pd.DataFrame({'a': [1,0,2,3,4],'b': [5,0,3,1,0], 'c': [8,2,100,26,23]})
df2 = df2.set_index('a')
df1.join(df2, how = 'left')  # raises the ValueError: 'b' and 'c' overlap

Solving the error is easy. Simply provide a suffix.

df1.join(df2, how = 'left', lsuffix = '_left', rsuffix = '_right')
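To see the failure and the fix side by side, here is a minimal self-contained sketch. It rebuilds the two example DataFrames, so it does not depend on the snippets above. The variable names `error_message`, `joined` and `right_only` are only there for illustration.

```python
import pandas as pd

# Same example data as in the post, indexed by 'a'.
df1 = pd.DataFrame({'a': [1, 0, 2, 3, 4], 'b': [0, 0, 0, 0, 0], 'c': [9, 12, 15, 16, 54]}).set_index('a')
df2 = pd.DataFrame({'a': [1, 0, 2, 3, 4], 'b': [5, 0, 3, 1, 0], 'c': [8, 2, 100, 26, 23]}).set_index('a')

# Without suffixes, the overlapping columns 'b' and 'c' raise the ValueError.
try:
    df1.join(df2, how='left')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With both suffixes, every overlapping column gets a distinct name.
joined = df1.join(df2, how='left', lsuffix='_left', rsuffix='_right')
print(list(joined.columns))

# A single suffix is also accepted: only the right-hand columns are renamed.
right_only = df1.join(df2, how='left', rsuffix='_right')
print(list(right_only.columns))
```

Passing only one suffix can be handy when the left DataFrame is your "main" table and you want its column names to stay untouched.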

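There is an alternative solution. Use the merge method. When you don’t pass the ‘on’ argument, merge joins on every column name the two DataFrames share, so the overlapping columns become join keys instead of a naming conflict. With ‘how’ set to ‘left’, all rows of the left DataFrame are kept, and values from the right DataFrame are only added where all shared columns match. If you do pass an explicit key, merge renames the remaining overlapping columns with its default suffixes ‘_x’ and ‘_y’.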

df1.merge(df2, how = 'left')
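The difference between the two merge variants is easier to see in code. This is a small sketch, assuming ‘a’ is kept as a regular column rather than as the index. The names `merged_on_a` and `merged_default` are made up for this example.

```python
import pandas as pd

# Example data with 'a' as a regular column (not the index).
df1 = pd.DataFrame({'a': [1, 0, 2, 3, 4], 'b': [0, 0, 0, 0, 0], 'c': [9, 12, 15, 16, 54]})
df2 = pd.DataFrame({'a': [1, 0, 2, 3, 4], 'b': [5, 0, 3, 1, 0], 'c': [8, 2, 100, 26, 23]})

# Explicit key: 'b' and 'c' still overlap, so merge applies its
# default suffixes '_x' (left) and '_y' (right) instead of raising.
merged_on_a = df1.merge(df2, how='left', on='a')
print(list(merged_on_a.columns))

# No key: merge uses all shared columns ('a', 'b', 'c') as join keys.
# No row of df2 matches df1 on all three, so only df1's rows remain.
merged_default = df1.merge(df2, how='left')
print(merged_default)
```

So the variant without a key only looks like it "prefers" the left DataFrame. In this data, no row of the right DataFrame matches on all shared columns, so nothing from it ends up in the result.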

It depends on what you expect from joining your DataFrames but there are use cases for both solutions.

By the way, I didn’t necessarily come up with this solution myself. Although I’m grateful you’ve visited this blog post, you should know I get a lot from websites like StackOverflow and I have a lot of coding books. This one by Matt Harrison (on Pandas 1.x!) has been updated in 2020 and is an absolute primer on Pandas basics. If you want something broad, ranging from data wrangling to machine learning, try “Mastering Pandas” by Stefanie Molin.

### Say thanks, ask questions or give feedback

Technologies get updated, syntax changes and honestly… I make mistakes too. If something is incorrect, incomplete or doesn’t work, let me know in the comments below and help thousands of visitors.

A final note: I ran into this error in a very unusual situation: both DataFrames initially had no overlapping columns. However, I ran the join twice and assigned the result back to one of the initial DataFrames, so on the second run the columns did overlap.
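That situation can be reproduced with a few lines. The sketch below uses hypothetical DataFrames (`sales` and `costs`) and shows one possible guard, which is not the only way to handle it: only join the columns that are still missing.

```python
import pandas as pd

# Hypothetical example data: no overlapping columns at the start.
sales = pd.DataFrame({'revenue': [10, 20, 30]}, index=['x', 'y', 'z'])
costs = pd.DataFrame({'cost': [4, 8, 12]}, index=['x', 'y', 'z'])

# First run: works, and 'sales' now contains the 'cost' column.
sales = sales.join(costs)

# Second run (for example a re-executed notebook cell):
# 'cost' now exists in both DataFrames, so the join fails.
try:
    sales = sales.join(costs)
    second_error = None
except ValueError as exc:
    second_error = str(exc)

# Possible guard: only join columns that 'sales' does not have yet.
new_cols = costs.columns.difference(sales.columns)
if len(new_cols) > 0:
    sales = sales.join(costs[new_cols])

print(list(sales.columns))
```

Assigning the joined result to a new variable instead of overwriting the original DataFrame avoids the problem altogether.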
