Python: pandas groupby two columns, without merging them

My dataframe looks like this:

| col1 | col2 | col3 |
| ---- | ---- | ---- |
|  1   | abc  | txt1 |
|  1   | abc  | txt2 |
|  2   | abc  | txt3 |
|  1   | xyz  | txt4 |
|  2   | xyz  | txt5 |

I want to merge the text in col3 across rows only when the rows have the same value in col1 AND the same value in col2.

Expected result:

| col1 | col2 | col3       |
| ---- | ---- | ---------- |
|  1   | abc  | txt1, txt2 |
|  2   | abc  | txt3       |
|  1   | xyz  | txt4       |
|  2   | xyz  | txt5       |

I have used this:

df = df.groupby([df['col1'], df['col2']]).aggregate({'col3': ', '.join})

This joins col3 correctly, but it also merges col1 and col2 into one column (list). How can I achieve the expected result while keeping 3 separate columns (col1, col2, col3)?

Solution:

A possible solution, which:

  • Groups by the two columns, col1 and col2. Passing as_index=False keeps both keys as regular columns instead of moving them into the index.

  • Aggregates the values in col3 for each group with a lambda function that concatenates them into a single string, separated by a comma and a space.

(df.groupby(['col1', 'col2'], as_index=False)
 .agg({'col3': lambda x: ', '.join(x)}))

Output:

   col1 col2        col3
0     1  abc  txt1, txt2
1     1  xyz        txt4
2     2  abc        txt3
3     2  xyz        txt5
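
The output above is sorted by the group keys, so the rows come out in a different order from the expected table in the question. As a minimal sketch, assuming the sample data from the question, passing sort=False should keep the groups in order of first appearance. The names `result` and `alt` are only illustrative, and the `reset_index()` variant is an alternative spelling rather than part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 1, 2],
    "col2": ["abc", "abc", "abc", "xyz", "xyz"],
    "col3": ["txt1", "txt2", "txt3", "txt4", "txt5"],
})

# as_index=False keeps col1 and col2 as ordinary columns.
# sort=False keeps groups in order of first appearance instead of sorting by key.
result = (
    df.groupby(["col1", "col2"], as_index=False, sort=False)
      .agg({"col3": ", ".join})
)

# Alternative spelling: group with the keys in the index,
# then move the index levels back into columns with reset_index().
alt = (
    df.groupby(["col1", "col2"], sort=False)["col3"]
      .agg(", ".join)
      .reset_index()
)

print(result)
```

Both variants give three separate columns. Which one reads better is mostly a matter of taste; as_index=False avoids the extra reset_index() step.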