Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Map A Dictionary With Pandas Column

I have a pandas dataframe column province which contains USA states and Canada province names in both uppercase and lowercase

df.province

Output

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

MANITOBA
Alberta
CA
OH

A dictionary to map all Canada province values

ca_provinces_dic = {
    'Newfoundland and Labrador': 'NL',
    'Prince Edward Island': 'PE',
    'Nova Scotia': 'NS',
    'New Brunswick': 'NB',
    'Quebec': 'QC',
    'Ontario': 'ON',
    'Manitoba': 'MB',
    'Saskatchewan': 'SK',
    'Alberta': 'AB',
    'British Columbia': 'BC',
    'Yukon': 'YT',
    'Northwest Territories': 'NT',
    'Nunavut': 'NU',
}

How to map all the province values with their respective short form using pandas that handles the case matching issue as well?

For example –
I want to compare all the province column values (in lowercase) with all the dictionary keys (in lowercase) and based on the match I will apply the appropriate short form for the province column values (only where applicable)


My current attempt to this problem.

canada_provinces_dic = {
    'Newfoundland and Labrador': 'NL',
    'Prince Edward Island': 'PE',
    'Nova Scotia': 'NS',
    'New Brunswick': 'NB',
    'Quebec': 'QC',
    'Ontario': 'ON',
    'Manitoba': 'MB',
    'Saskatchewan': 'SK',
    'Alberta': 'AB',
    'British Columbia': 'BC',
    'Yukon': 'YT',
    'Northwest Territories': 'NT',
    'Nunavut': 'NU',
}


def handle_state(data_attr):
  for k, v in canada_provinces_dic.items():
      if data_attr.strip().lower() == k.lower():
        return canada_provinces_dic[k]
  else:
    return data_attr
df['state'] =  df.state.apply(handle_state)

>Solution :

Use Series.map and Series.fillna:

First convert your dict to keys in lowercase:

In [638]: ca_provinces_dic = {k.lower():v for k,v in ca_provinces_dic.items()}

In [659]: df['province'] = df['province'].str.lower().map(ca_province_dict).fillna(df['province'])

In [660]: df
Out[660]: 
  province
0       MB
1       AB
2       CA
3       OH
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading