I am facing a problem using pandas in Python that I can’t solve.
I would like to merge/combine/regroup the rows that have the same url.
EDIT :
I have a DataFrame that looks like this:
| url | col1 | col2 | col3 | col4 |
|---|---|---|---|---|
| aaa | xx | yy | | |
| bbb | zz | | | |
| aaa | ee | | | |
| AA |
I would like something like this:
| url | col1 | col2 | col3 | col4 |
|---|---|---|---|---|
| aaa | ee | xx | yy | |
| bbb | zz | cc | | |
| AA |
I’ve tried using groupby, but my df contains rows that don’t have a URL, and I want to keep them.
I’ve also tried merge with an inner join, which gives me pretty good results, but I don’t know why it multiplies the number of rows in my df tenfold.
Thank you.
>Solution :
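You can use groupby and first. For each url, first() returns the first non-null value in every column, so the partial rows for that url collapse into a single row. Note that groupby drops rows whose url is NaN by default, so rows without a URL will not appear in the result of this line alone.
df = df.groupby('url', as_index=False).first()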
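To keep the rows that have no URL, as the question asks, one option is to group only the rows that do have a url and then concatenate the remaining rows back unchanged. The following is a minimal sketch, not a definitive implementation: the sample data is hypothetical (loosely modelled on the question), and it assumes empty cells are stored as NaN rather than as empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; empty cells are assumed to be NaN.
df = pd.DataFrame({
    "url":  ["aaa", "bbb", "aaa", np.nan],
    "col1": [np.nan, "zz", "ee", np.nan],
    "col2": ["xx", np.nan, np.nan, "AA"],
    "col3": ["yy", np.nan, np.nan, np.nan],
})

# Boolean mask: True for rows that have a url.
has_url = df["url"].notna()

# Collapse rows sharing a url; first() keeps the first non-null value per column.
# sort=False keeps the urls in their order of first appearance.
merged = df[has_url].groupby("url", as_index=False, sort=False).first()

# Append the rows without a url unchanged instead of grouping them together.
result = pd.concat([merged, df[~has_url]], ignore_index=True)
print(result)
```

If the empty cells are actually empty strings, first() will treat them as real values, so they would need to be converted to NaN beforehand (for example with `df.replace("", np.nan)`).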