There are many situations that you have an 1:n relationship, where you would have wanted an 1:1 relationship. Sometimes, it makes sense to concatenate the strings from rows belonging to the same group. Let’s do that in Python, using Pandas.
Imagine, you have the following table that contains an ID and a fruit variety. Each ID corresponds to a specific kind of fruit: apples (32), pears (84) and grapes (12). Let’s say you’re trying to join this table with its ID on another table.
ID | Variety |
32 | Jonagold |
32 | Golden |
32 | Granny Smith |
84 | Conference |
84 | Doyenné |
12 | Thompson Seedless |
However, you want a 1:1 relationship, so your table should look like this:
ID | Varieties |
32 | Jonagold, Golden, Granny Smith |
84 | Conference, Doyenné |
12 | Thompson Seedless |
You can see that the table is grouped by its ID and the varieties are joined and separated by a comma.
df.head() # our table that contains the fruit df['Variety'] = df.groupby(['ID'])['Variety'].transform(lambda x: ', '.join(x))
To achieve this, we group the DataFrame by the ID, and select the Variety column. We transform by applying a lambda function on all the rows. This lambda function concatenates all the values within the group, separated by a comma and space.
You now have the following table.
ID | Varieties |
32 | Jonagold, Golden, Granny Smith |
32 | Jonagold, Golden, Granny Smith |
32 | Jonagold, Golden, Granny Smith |
84 | Conference, Doyenné |
84 | Conference, Doyenné |
12 | Thompson Seedless |
Finally, deduplicate the table, and reset the index.
df = df.drop_duplicates().reset_index(drop = True)
Congratulations, you have concatenated strings from multiple rows by category.
Thanks,
For one week I was looking to find a technique to combine my records based on their Id without missing information.
I used groupbu but unfortunately I missed lots of parts of my notes.
fortunately, I found your page.
Thank you for your sharing. I am worried that I lack creative ideas. It is your article that makes me full of hope. Thank you. But, I have a question, can you help me?