I have a problem with my Python code. I’m using pandas to read a dataset and store it in a DataFrame. I’m now trying to convert ug to mg (1000 ug == 1 mg) and g to mg (1000 mg == 1 g).
I’m first converting the data type of the column to float64:
df[data_column] = df[data_column].astype("float64")
After that, I’m selecting all the rows whose unit is ug and multiplying their values by 0.001, and then multiplying the values of the rows whose unit is g by 1000:
df.loc[df[unit_colum] == "g", [data_column]] *= 1000
df.loc[df[unit_colum] == "ug", [data_column]] *= 0.001
Btw:
I know that I can also divide values in pandas, but this code should eventually run in a loop where it also converts other units (e.g. l -> ml).
My question now is:
Is there any chance that a floating-point error occurs, and what is the best way to prevent it?
I already thought about not converting the DataFrame columns to float64 and just working with the strings, but this isn’t my preferred way.
>Solution :
Yes, floating-point errors can occur (0.001, for example, has no exact binary representation), and it is difficult to fully avoid them in general.
You have two major options to avoid/limit them:
- perform your computations in the smallest available unit (here µg) as integers
- round the values to the desired precision after conversion
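As a rough illustration of both options, here is a minimal sketch. The sample data, the column names (`amount`, `unit`, `amount_ug`, `amount_mg`) and the choice of 6 decimal places are assumptions made for the example, not something taken from your dataset:

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "amount": [2, 250, 3],
    "unit": ["g", "ug", "mg"],
})

# Option 1: store everything as integer micrograms, so the arithmetic stays exact.
to_ug = {"ug": 1, "mg": 1_000, "g": 1_000_000}
df["amount_ug"] = df["amount"] * df["unit"].map(to_ug)

# Option 2: convert to milligrams as floats, then round to a fixed precision
# (6 decimals is an arbitrary choice here; pick what your data needs).
to_mg = {"ug": 0.001, "mg": 1.0, "g": 1000.0}
df["amount_mg"] = (df["amount"] * df["unit"].map(to_mg)).round(6)

print(df)
```

Option 1 only works if the raw values can be expressed as whole numbers of the smallest unit. Option 2 does not remove the error, it just keeps it below the precision you care about.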
Also, a tip for your conversion: rather than using multiple lines, you can map the factors:
factors = {'ug': 0.001, 'g': 1000, 'mg': 1}
df['data_column'] *= df['unit_column'].map(factors)
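One thing to watch with the mapping approach: a unit that is missing from the dictionary maps to NaN, so the converted value becomes NaN as well. A small sketch of how you might check for that, using made-up sample data and a hypothetical result column `data_mg` (the "kg" row is there on purpose to trigger the case):

```python
import pandas as pd

# Hypothetical sample data; "kg" is deliberately missing from the factors.
df = pd.DataFrame({
    "data_column": [2.0, 500.0, 7.0, 1.0],
    "unit_column": ["g", "ug", "mg", "kg"],
})
factors = {"ug": 0.001, "g": 1000, "mg": 1}

# Units without an entry in `factors` become NaN here.
multiplier = df["unit_column"].map(factors)

# Report units that have no conversion factor defined.
unknown = df.loc[multiplier.isna(), "unit_column"].unique().tolist()
if unknown:
    print("No factor defined for:", unknown)

# Rows with an unknown unit end up as NaN in the result.
df["data_mg"] = df["data_column"] * multiplier
print(df)
```

Writing to a separate column instead of using `*=` keeps the original values around, which makes it easier to inspect rows that could not be converted.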