Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to factorize multiple dataframe columns with same values?

I have some football data with two team names and some numerical data:

ar = [
    ["browns", "patriots", 2, 5],
    ["patriots", "bills", 4, 15],
    ["browns", "bills", 1, 10],
    ["eagles", "browns", 3, 11]
]

frame = pandas.DataFrame(ar, columns=['Team1', 'Team2', 'Down', 'ToGo'])

I want to turn the team names into an enumerated type, which I can do for the first team name like:

frame['Team1'], uniques = frame['Team1'].factorize()

How can I do this to both team names at the same time, so there is consistent uniques and not have missing values? Note that 'bills' is not in Team1 column, so I can’t just apply the mapping (though I don’t know how to do that either).

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Edit: The resulting team names should look like:

   Team1  Team2
0      0      1
1      1      3
2      0      3
3      2      0

so that ‘browns’ is labeled consistently as ‘0’, in row 0 as Team1, and in row 3 as Team2.

>Solution :

import numpy as np

# get the unique values from the two columns
unique = np.unique(df[['Team1', 'Team2']])
# get the factors 
factors = np.arange(len(unique))
# map the values to the corresponding factor 
df[['Team1', 'Team2']] = df[['Team1', 'Team2']].replace(unique, factors)

>>> df

   Team1  Team2  Down  ToGo
0      1      3     2     5
1      3      0     4    15
2      1      0     1    10
3      2      1     3    11

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading