Creating a new column with the length of another column in a DataFrame

I have a DataFrame containing average user ratings, languages, size and the number of user ratings. Now I’d like to create a new column with the number of languages.

Printing the languages returns:

print(df.iloc[0]['Languages'])
#DA, NL, EN, FI, FR, DE, IT, JA, KO, NB, PL, PT, RU, ZH, ES, SV, ZH

Then I try to create a new column, ‘Languages Count’:

for index, row in df.iterrows():
    row['Languages Count'] = len(row['Languages'].split(','))

Now, looking at Languages Count, every value is 1, and I’m not entirely sure why this doesn’t work. I was expecting the number of languages for each row: 17 for the first one, since the list printed above has 17 entries, and 2 for the second one, which only has 2 languages.

Solution:

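The loop has no effect because iterrows() yields each row as a separate Series, so assigning to row never writes back to df. Instead, count the commas in each string and add 1 to get the number of values: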

df['Languages Count'] = df['Languages'].str.count(',').add(1)
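
A minimal sketch of this on made-up data, assuming pandas is installed. The sample rows and the Size values are hypothetical; only the Languages column name and the first language string come from the question. It also checks that the question's loop leaves df unchanged, and shows str.split(',').str.len() as an alternative that should give the same counts.

```python
import pandas as pd

# Hypothetical sample data: the first string is the one printed in the
# question, the second row and the Size column are invented for illustration.
df = pd.DataFrame({
    "Languages": [
        "DA, NL, EN, FI, FR, DE, IT, JA, KO, NB, PL, PT, RU, ZH, ES, SV, ZH",
        "EN, DE",
    ],
    "Size": [100, 200],
})

# The loop from the question: each `row` is its own Series, so adding a
# key to it does not create a column in df.
for index, row in df.iterrows():
    row["Languages Count"] = len(row["Languages"].split(","))
loop_added_column = "Languages Count" in df.columns
print(loop_added_column)  # False

# Vectorized version: number of commas plus one.
df["Languages Count"] = df["Languages"].str.count(",").add(1)

# Alternative: split on commas and take the length of each resulting list.
split_count = df["Languages"].str.split(",").str.len()

print(df["Languages Count"].tolist())  # [17, 2]
print(split_count.tolist())            # [17, 2]
```

Both vectorized forms assume every row holds a non-empty string: a missing value stays NaN, and an empty string would be counted as one language by either form.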