Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python Replace existing column value in df based on separate lookup Code/Value (df) KeyError on columns.get_loc

** Problem **

Replace existing column value in destination data frame based on separate lookup type operations to match Code/Value column in another separate source data frame, and update, e.g., replace text in destination data frame column with Code/Value of source data. Basically replacing something like ’10’ with ‘Your Full Name’.

** Code attempt with Key Error **

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

This effort raised a Key Error.

countynames.set_index('CountyCode')
employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])

** Potential Solution Idea **

Something similar to have apply() function lookup employee[‘County.Code’] in dataframe ‘countynames’ and replace/overwite/update the existing employee[‘County.Code’] with the countynames[‘Value’].

Looking into alternative approaches, as my first attempt resulted in KeyError.

### potential approach 1: 
employee['County.Code'] = countynames.apply(lambda x: employee.loc[x['County.Code'], x['Value']], axis=1)

### potential approach 2: 
employee['County.Code']<- lapply(employee, function(x) look$class[match(x, look$CountyCode)])

** Experimental code **

employee = pd.read_csv("employee_data.csv")
countynames = pd.read_csv("County Codes.csv")
employee['County.Code']

0    34
1    34
2    34
3    34
4    55
Name: County.Code, dtype: int64

Source, lookup data frame:

countynames.head()
    CountyCode  Value
0   1   Alameda
1   2   Alpine
2   3   Amador
3   4   Butte
4   5   Calaveras

** Error: KeyError **

Error is raised on columns.get_loc(item)

KeyError                                  Traceback (most recent call last)
Input In [410], in <cell line: 2>()
      1 countynames.set_index('CountyCode')
----> 2 employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])

File ~\anaconda3\lib\site-packages\pandas\core\frame.py:4602, in DataFrame.lookup(self, row_labels, col_labels)
   4600     result = np.empty(n, dtype="O")
   4601     for i, (r, c) in enumerate(zip(row_labels, col_labels)):
-> 4602         result[i] = self._get_value(r, c)
   4604 if is_object_dtype(result):
   4605     result = lib.maybe_convert_objects(result)

File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3615, in DataFrame._get_value(self, index, col, takeable)
   3612     series = self._ixs(col, axis=1)
   3613     return series._values[index]
-> 3615 series = self._get_item_cache(col)
   3616 engine = self.index._engine
   3618 if not isinstance(self.index, MultiIndex):
   3619     # CategoricalIndex: Trying to use the engine fastpath may give incorrect
   3620     #  results if our categories are integers that dont match our codes
   3621     # IntervalIndex: IntervalTree has no get_loc

File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3931, in DataFrame._get_item_cache(self, item)
   3926 res = cache.get(item)
   3927 if res is None:
   3928     # All places that call _get_item_cache have unique columns,
   3929     #  pending resolution of GH#33047
-> 3931     loc = self.columns.get_loc(item)
   3932     res = self._ixs(loc, axis=1)
   3934     cache[item] = res

File ~\anaconda3\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)

KeyError: 1

>Solution :

Always hard without having the data.

But try:

employee['County.Code'].replace(countynames.set_index("CountyCode")["Value"].to_dict(), inplace=True)
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading