One column of my dataset has both strings and floats. In that column, I am trying to replace each string with only its first 5 characters, leaving the numeric values alone.
import numpy as np
import pandas as pd

def isfloat(num):
    try:
        float(num)
        return True
    except ValueError:
        return False

df = pd.DataFrame([[1, "Alligator"], [1, 3], [4, "Markets"]], columns=['A', 'B'])
The following two methods don’t seem to change the actual dataframe.
df['B'].apply(lambda x: float(x) if isfloat(x) else x[0:5])
for index, row in df.iterrows():
    if not isfloat(row.B):
        row.B = row.B[0:5]
This next method raises the error "cannot convert the series to <class 'float'>", I think because isfloat cannot be called on a whole Series in this way.
df['B'] = np.where(not isfloat(df['B']), df['B'][0:5], df['B'])
I tried using .loc as well but it did not seem suitable because of the condition I need to base the change on. How would one go about this, or what am I missing?
>Solution :
I believe you need:
df['B']=df['B'].apply(lambda x: float(x) if isfloat(x) else x[0:5])
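The assignment is needed because apply returns a new Series rather than modifying the column in place, so the result has to be written back to df['B'].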
Output:
   A      B
0  1  Allig
1  1    3.0
2  4  Marke
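Since the question also mentions .loc and np.where, here is a rough sketch of how both could be made to work. The idea is to build the condition element by element with apply(isfloat) instead of passing the whole Series to isfloat. The names is_text, by_loc and by_where are only illustrative, catching TypeError in isfloat is my own addition, and unlike the apply version above this leaves the numeric values as they are instead of converting them to float.

```python
import numpy as np
import pandas as pd


def isfloat(value):
    """Return True if value can be converted to float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):  # TypeError added here as a precaution
        return False


df = pd.DataFrame([[1, "Alligator"], [1, 3], [4, "Markets"]], columns=["A", "B"])

# Element-wise mask: True where the value is NOT float-convertible.
is_text = ~df["B"].apply(isfloat)

# Option 1: assign through .loc so the frame itself is modified.
by_loc = df.copy()
by_loc.loc[is_text, "B"] = by_loc.loc[is_text, "B"].str[:5]

# Option 2: np.where with the element-wise mask instead of isfloat(df['B']).
by_where = df.copy()
by_where["B"] = np.where(is_text, df["B"].str[:5], df["B"])

print(by_loc)
print(by_where)
```

The iterrows loop in the question has no effect for a related reason: each row it yields is a copy, so assigning to row.B never reaches df.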