I’m surprised I couldn’t find anything about this:
Q: How can I convert values of a column to numbers with a default value if casting fails.
I need all values to be numbers in order to perform arithmetical functions on them. The raw input data contains non-convertible values like "","**", "not available", ….
E.g.
1
2
*
4
Should become:
1
2
0
4
What I’ve tried:
to_numberic: Does not allow to set a default value.errors=coercewill set the value toNaN, which can not be handled arithmetically afterwards.df[COLUMN].apply (makeNumber)withdef makeNumber(value): return 0.0does apparently nothing on the string values.
>Solution :
If no missing values in original use:
df['col'] = pd.to_numeric(df['col'], errors='coerce').fillna(0)
If need downcast like comment solution:
df['col'] = pd.to_numeric(df['col'], errors='coerce').fillna(0, downcast="infer")
print (df)
col
0 1
1 2
2 0
3 4
If possible missing values in original and need not replace them:
print (df)
col
0 1
1 2
2 *
3 4
4 NaN
s = pd.to_numeric(df['col'], errors='coerce')
df['col'] = s.mask(s.isna() & df['col'].notna(), 0)
print (df)
col
0 1.0
1 2.0
2 0.0
3 4.0
4 NaN