I am trying to swap values between the Range and Unit columns of the dataframe below: if Unit contains "-", the Unit value should move to Range and the Range value to Unit. To do that, I am creating a unit_backup column so that I don't lose the original Unit value.
1. dataframe
import pandas as pd

sample_df = pd.DataFrame({'Range': ['34-67', 12, 'gm', '45-90'],
                          'Unit': ['ml', '35-50', '10-100', 'mg']})
sample_df
   Range    Unit
0  34-67      ml
1     12   35-50
2     gm  10-100
3  45-90      mg
2. Function
Below is the code I have tried but I am getting error in this:
def range_unit_correction_fn(df):
    # creating backup of Unit column
    df['unit_backup'] = df['Unit']
    # condition check
    if df['unit_backup'].str.contains("-"):
        # if condition is True then replace `Unit` value with `Range` and `Range` with `unit_backup`
        df['Unit'] = df['Range']
        df['Range'] = df['unit_backup']
    else:
        # if condition False then keep the same value
        df['Range'] = df['Range']
    # drop the backup column
    df = df.drop(['unit_backup'],axis=1)
    return df
- Applying above function on dataframe
sample_df = sample_df.apply(range_unit_correction_fn, axis=1)
sample_df
Error:
1061 def apply_standard(self):
1062 if self.engine == "python":
-> 1063 results, res_index = self.apply_series_generator()
...
----> 4 if df['unit_backup'].str.contains("-"):
5 df['Unit'] = df['Range']
6 df['Range'] = df['unit_backup']
AttributeError: 'str' object has no attribute 'str'
It seems like a silly mistake, but I am not sure where I am going wrong.
I would appreciate any help.
>Solution :
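Because you call apply with axis=1, the function receives one row at a time as a pandas Series (so the parameter name df is misleading). Indexing that row with df['unit_backup'] therefore returns a single cell value, here a plain Python str, not a Series. A str has no .str accessor, hence the AttributeError.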
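A quick way to confirm this is to record the types the function actually sees. This is only a small diagnostic sketch: the helper name inspect_row and the list seen_types are made up for illustration, and the dataframe is a shortened copy of the one in the question.

```python
import pandas as pd

# Shortened copy of the question's dataframe (two rows are enough here).
sample_df = pd.DataFrame({'Range': ['34-67', 12],
                          'Unit': ['ml', '35-50']})

seen_types = []  # hypothetical scratch list, for illustration only

def inspect_row(row):
    # With axis=1, `row` is one row of the frame as a Series,
    # and row['Unit'] is the single cell value in that row.
    seen_types.append((type(row).__name__, type(row['Unit']).__name__))
    return row

sample_df.apply(inspect_row, axis=1)
print(seen_types)
```

Each recorded pair should show a Series for the row and a str for the cell, which is why .str is not available on it.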
To fix it you can check the condition directly on the string value in a row-wise approach:
def range_unit_correction_fn(df):
    # creating backup of Unit column
    df['unit_backup'] = df['Unit']
    # condition check
    if '-' in df['unit_backup']:
        # if condition is True then replace `Unit` value with `Range` and `Range` with `unit_backup`
        df['Unit'] = df['Range']
        df['Range'] = df['unit_backup']
    # drop the backup column
    df = df.drop(['unit_backup'])
    return df
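If you do not need the row-wise function, the same swap can probably be done without apply by building a boolean mask on the whole Unit column, where .str is available. The following is a sketch under a few assumptions rather than a definitive implementation: the function name swap_range_and_unit is made up, both columns are cast to object so they can hold mixed str/int cells, and any Unit value containing "-" is treated as a range.

```python
import pandas as pd

sample_df = pd.DataFrame({'Range': ['34-67', 12, 'gm', '45-90'],
                          'Unit': ['ml', '35-50', '10-100', 'mg']})

def swap_range_and_unit(df):
    # Work on a copy with object columns so mixed str/int cells are allowed.
    out = df.astype({'Range': object, 'Unit': object})
    # True where Unit looks like a range; astype(str) guards against
    # non-string cells, regex=False treats "-" as a literal character.
    swap_mask = out['Unit'].astype(str).str.contains('-', regex=False)
    # Swap the two columns on the masked rows only. to_numpy() drops the
    # column labels so pandas does not re-align them and undo the swap.
    out.loc[swap_mask, ['Range', 'Unit']] = (
        out.loc[swap_mask, ['Unit', 'Range']].to_numpy()
    )
    return out

result = swap_range_and_unit(sample_df)
print(result)
```

This version needs no backup column, because the right-hand side is evaluated in full before the assignment happens, and it leaves sample_df itself unchanged.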