Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to remove rows with duplicate values in some columns, while keeping values that are different?

I have a pandas dataframe in the following format.

| id | name    | last_name | address | x       | y | x_list    | y_list|
| -- | ------- | --------- | ------- | ------- | - | --------- | ----- |
| 1  | 'John'  | 'Smith'   | 'add_1' | 'one'   | 1 | ['one']   | [1]   |
| 2  | 'Tom'   | 'Davis'   | 'add_2' | 'two'   | 2 | ['two']   | [2]   |
| 3  | 'John'  | 'Smith'   | 'add_1' | 'three' | 3 | ['three'] | [3]   |
| 4  | 'Tom'   | 'Davis'   | 'add_2' | 'four'  | 4 | ['four']  | [4]   |
| 5  | 'Susan' | 'Jones'   | 'add_1' | 'one'   | 1 | ['one']   | [1]   |

I have no idea how to approach this problem. I need this output:

| id | name    | last_name  | address | x_list           | y_list |
| -- | ------- | ---------- | ------- | ---------------- | ------ |
| 1  | 'John'  | 'Smith'    | 'add_1' | ['one', 'three'] | [1, 3] |
| 2  | 'Tom'   | 'Davis'    | 'add_2' | ['two', 'four']  | [2, 4] |
| 3  | 'Susan' | 'Jones'    | 'add_1' | ['one']          | [1]    |

Basically, I need to return a new DataFrame, or modify the existing one so the columns with the same name, last_name, and address have their x_list and y_list merged. Can anyone help me how to do this in pandas? This needs to be done on a dataframe of about 58 000 rows.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

then use following code:

df.groupby(['name', 'last_name', 'address'])[['x_list', 'y_list']].sum().reset_index()

output:

    name    last_name   address x_list          y_list
0   John    Smith       add_1   [ one , three ] [ 1 , 3 ]
1   Susan   Jones       add_1   [ one ]         [ 1 ]
2   Tom     Davis       add_2   [ two , four ]  [ 2 , 4 ]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading