I have a dataframe with three columns: id, name and point. I need to select the rows where the value in 'point' is a string.
| id | name | point |
|---|---|---|
| 0 | x | 5 |
| 1 | y | 6 |
| 2 | z | ten |
| 3 | t | nine |
| 4 | q | two |
How can I split the dataframe based only on the type of the values in one column?
I tried to adapt the select_dtypes method but got lost. I also tried to filter the dataset using
df[df['point'].dtype == str] or df[df['point'].dtype is str]
but neither worked.
>Solution :
Technically, the answer would be:
out = df[df['point'].apply(lambda x: isinstance(x, str))]
But this would also select rows containing a string representation of a number ('5').
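As a minimal sketch of the element-wise check, assuming the numeric entries in the question's table are stored as real Python ints (the table alone does not say), the boolean mask can be built once and reused to get both halves of the split. The names `is_str`, `strings`, `others` and `mixed` are only illustrative:

```python
import pandas as pd

# Sample data mirroring the question's table.
# Assumption: 5 and 6 are real ints, the rest are Python strings.
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "name": ["x", "y", "z", "t", "q"],
    "point": [5, 6, "ten", "nine", "two"],
})

# Element-wise test: True where the stored object is a str.
is_str = df["point"].apply(lambda x: isinstance(x, str))

strings = df[is_str]    # rows whose 'point' is a str
others = df[~is_str]    # the remaining rows

# The caveat: a numeric string such as '5' is still a str.
mixed = pd.Series([5, "5", "ten"])
mixed_is_str = mixed.apply(lambda x: isinstance(x, str)).tolist()
```

Note that `df['point'].dtype` describes the column as a whole, so comparing it to `str` yields a single value rather than a per-row mask, which is why a per-element check is used here.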
If you want to select "strings" as opposed to "numbers", regardless of whether the numbers are real numbers or string representations, you could use:
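m = pd.to_numeric(df['point'], errors='coerce')  # NaN where the value cannot be parsed as a number
out = df[df['point'].notna() & m.isna()]  # keep non-missing values that are not numeric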
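The question now is: what should happen if a value is something like '1A' or 'AB123'? Since pd.to_numeric cannot parse such values, errors='coerce' turns them into NaN and they are treated as strings.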
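To check how the `pd.to_numeric` approach treats such mixed values, here is a small sketch on a hypothetical Series (the sample values and the names `s`, `non_numeric` and `numeric` are made up for illustration):

```python
import pandas as pd

# Hypothetical values: a real number, a numeric string, plain text,
# mixed alphanumeric tokens, and a missing value.
s = pd.Series([5, "6", "ten", "1A", "AB123", None], name="point")

# Anything that cannot be parsed as a number becomes NaN.
m = pd.to_numeric(s, errors="coerce")

# Non-numeric: present in the original, but not parseable as a number.
non_numeric = s[s.notna() & m.isna()]

# Numeric: real numbers and string representations of numbers.
numeric = s[m.notna()]
```

Here '1A' and 'AB123' land in `non_numeric` together with 'ten', while the missing value is excluded from both groups. If such tokens should be handled differently, the mask would need an additional rule.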