Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to merge two rows with the same value in a given column

Hey I have a DataFrame like this:

d =  {"KEY": ["KEY2", "KEY2"], "String value": ["value 1", "value 2"], "list value": [["val1"], ["val2"]]}
df = pd.DataFrame(d)
df

In KEY column there is the same value in both rows. What I want to do is to create one row from this two rows (or more in my DataFrame) in such a way that values from from a given columns (except KEY and possibly one more column) are added. So finally I want sth like this:

d2 = {"KEY": ["KEY2"], "String value": ["value 1" + "value 2"], "list value": [["val1"] + ["val2"]]}
res = pd.DataFrame(d2)
res

How can I do that?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

groupby and sum

it is possible to process columns separately and leave some columns "unchanged" by keeping the first value, assuming it is the same for rows with the same key.

d =  {"KEY": ["KEY2", "KEY2"], "String value": ["value 1", "value 2"], "list value": [["val1"], ["val2"]],
      "other value": [1, 1]}
df = pd.DataFrame(d)


df2 = pd.DataFrame(df.groupby('KEY')['String value'].apply(lambda x:'\n'.join(x)))
df2['list value'] = df.groupby('KEY')['list value'].sum()
df2['other value'] = df.groupby('KEY')['other value'].first()
df2
    KEY     String value    list value
0   KEY2    value 1value 2  [val1, val2]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading