I am trying to remove duplicate rows from my DataFrame while saving their data into the columns where the kept row is NA/empty.
Example:
I have the following DataFrame, and I would like to remove all the duplicates in column A but merge the values from the rest of the columns:
A | B | C | D | E |
---|---|---|---|---|
1 | X |  |  |  |
2 | X |  |  |  |
2 |  | X |  |  |
2 |  |  | X |  |
3 |  | X |  |  |
3 |  |  | X |  |
2 |  |  |  | X |
The expected output:
A | B | C | D | E |
---|---|---|---|---|
1 | X |  |  |  |
2 | X | X | X | X |
3 |  | X | X |  |
How can I perform the above dynamically?
Thanks in advance for the answers
>Solution :
You can use groupby followed by first(), because first() computes the first non-null entry of each column within each group:
>>> df.groupby('A', as_index=False).first()
   A     B     C     D     E
0  1     X  None  None  None
1  2     X     X     X     X
2  3  None     X     X  None
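
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample frame is rebuilt by hand, and which row holds which X is an assumption, chosen only so that the result matches the output above. Depending on the pandas version, the empty cells may print as None or NaN.

```python
import pandas as pd

# Sample data: the placement of "X" per row is assumed for illustration.
df = pd.DataFrame({
    "A": [1, 2, 2, 2, 3, 3, 2],
    "B": ["X", "X", None, None, None, None, None],
    "C": [None, None, "X", None, "X", None, None],
    "D": [None, None, None, "X", None, "X", None],
    "E": [None, None, None, None, None, None, "X"],
})

# One row per value of A; first() takes the first non-null value
# of every other column within each group.
merged = df.groupby("A", as_index=False).first()
print(merged)
```

Note that if a group holds several different non-null values in the same column, only the first one is kept and the others are dropped.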