Change values in a dataframe column with mixed types based on a condition

One column of my dataset contains both strings and floats. I want to replace each string in that column with only its first 5 characters, leaving the numbers alone.

import numpy as np
import pandas as pd

def isfloat(num):
    try:
        float(num)
        return True
    except ValueError:
        return False

df = pd.DataFrame([[1, "Alligator"], [1, 3], [4, "Markets"]], columns=['A', 'B'])

The following two methods don’t seem to change the actual dataframe.

df['B'].apply(lambda x: float(x) if isfloat(x) else x[0:5])

for index, row in df.iterrows():
    if not isfloat(row.B):
        row.B = row.B[0:5]

This next method raises the error "cannot convert the series to <class 'float'>", I think because isfloat cannot be called on a whole Series in this way.

df['B'] = np.where(not isfloat(df['B']), df['B'][0:5], df['B'])

I tried using .loc as well but it did not seem suitable because of the condition I need to base the change on. How would one go about this, or what am I missing?

Solution:

I believe you need:

df['B'] = df['B'].apply(lambda x: float(x) if isfloat(x) else x[0:5])

The result has to be assigned back because apply returns a new Series rather than modifying the column in place.

Output:

   A      B
0  1  Allig
1  1    3.0
2  4  Marke
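
As for the .loc route mentioned in the question, here is a minimal sketch of how it could work, assuming the same isfloat helper and sample frame. The mask name is_text is only illustrative, and catching TypeError alongside ValueError is an extra precaution that the original helper does not have. Unlike the apply version above, this leaves the numeric entries untouched, so 3 is not converted to 3.0.

```python
import pandas as pd


def isfloat(value):
    """Return True if value can be converted to float."""
    try:
        float(value)
        return True
    except (ValueError, TypeError):
        return False


df = pd.DataFrame([[1, "Alligator"], [1, 3], [4, "Markets"]], columns=["A", "B"])

# Boolean mask: True for rows whose B value is not float-convertible.
# isfloat is applied element by element, not to the whole Series at once.
is_text = ~df["B"].apply(isfloat)

# Truncate only the masked rows; assigning through .loc writes into df itself.
df.loc[is_text, "B"] = df.loc[is_text, "B"].str[:5]

print(df)
```

The iterrows loop in the question most likely has no effect because each row it yields is typically a copy of the data, so assigning to row.B never reaches df.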