
(Python) Transform dataframe: parse rows if more than one value is given and add to corresponding given rows of dataframe

My sample dataset looks like this:

Language                            Count
Russian                             1000
English                             1500
Spanish                             500
Arabic,Russian, English, Spanish    2
Arabic, English                     15

I want to transform it so that the result looks like this:

Language Count
Russian 1002
English 1517
Spanish 502
Arabic 17

In other words, each row that lists more than one language is split, and its count is added to every language it mentions. A language that has no row of its own yet (here, Arabic) gets a new row.

How can I achieve this?
Thank you!

>Solution :

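Use DataFrame.assign with Series.str.split to turn each Language string into a list, DataFrame.explode to get one row per language, Series.str.strip to drop the spaces left after commas (as in "Arabic, English"), and finally group by Language and sum:

# Split on commas, expand to one row per language, trim whitespace, then total the counts.
# sort=False keeps the languages in order of first appearance.
df = (df.assign(Language=df['Language'].str.split(','))
        .explode('Language')
        .assign(Language=lambda d: d['Language'].str.strip())
        .groupby('Language', as_index=False, sort=False)
        .sum())
print(df)
  Language  Count
0  Russian   1002
1  English   1517
2  Spanish    502
3   Arabic     17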
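
To try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand, and the helper name sum_counts_per_language is made up for illustration only. It assumes the frame has exactly the two columns Language and Count.

```python
import pandas as pd

# Sample data copied from the question, including the uneven spacing after commas.
df = pd.DataFrame({
    "Language": [
        "Russian",
        "English",
        "Spanish",
        "Arabic,Russian, English, Spanish",
        "Arabic, English",
    ],
    "Count": [1000, 1500, 500, 2, 15],
})


def sum_counts_per_language(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per language with its total Count."""
    return (
        frame.assign(Language=frame["Language"].str.split(","))  # string -> list
        .explode("Language")                                      # list -> rows
        .assign(Language=lambda d: d["Language"].str.strip())     # trim spaces
        .groupby("Language", as_index=False, sort=False)["Count"]
        .sum()
    )


result = sum_counts_per_language(df)
print(result)
```

Selecting ["Count"] before summing is a defensive choice: if the real data has other columns, they are left out of the aggregation instead of being summed or concatenated.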