Substituting multiple repetitive strings in pandas dataframe with consecutive respective numeric values

I asked this question for R, but I am now trying to do the same in Python.

I have a dataframe with 10000 rows.

Author  Value
aaa     111
aaa     112
bbb     156
bbb     165
ccc     543
ccc     256

Each author has 4 rows, so I have 2500 authors.

I would like to replace each author string with a consecutive numeric value, so that all rows for the same author share one number. Ideally with pandas.

Expected output

Author  Value
1       111
1       112
2       156
2       165
3       543
3       256
---------
2500    451
2500    234

Thanks!

Solution:

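Use pd.factorize(). It returns a tuple of integer codes (starting at 0, assigned in order of first appearance) and the unique values. Take the codes and add 1 so the numbering starts at 1: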

df['Author'] = pd.factorize(df['Author'])[0] + 1
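
For anyone who wants to try this, here is a minimal sketch on the six sample rows from the question. The small DataFrame is rebuilt by hand for illustration only (the real one has 10000 rows), and the variable names codes and uniques are just illustrative.

```python
import pandas as pd

# Sample data copied from the question; the real frame has 10000 rows.
df = pd.DataFrame({
    "Author": ["aaa", "aaa", "bbb", "bbb", "ccc", "ccc"],
    "Value": [111, 112, 156, 165, 543, 256],
})

# factorize returns (codes, uniques):
#   codes   - one 0-based integer per row, assigned in order of first appearance
#   uniques - the distinct author strings, in that same order
codes, uniques = pd.factorize(df["Author"])

# Shift by 1 so the first author becomes 1 instead of 0.
df["Author"] = codes + 1

print(df)
print(list(uniques))  # position i holds the author that was numbered i + 1
```

Keeping uniques around is optional, but it lets you map a number back to the original author string later if you need it.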