I have a DataFrame with a column ‘A1’ that contains multiple ‘Hello’ strings as well as positive and negative integers. I want to count the ‘Hello’ strings, all numbers >= 0 and all numbers < 0, so that I get three counts in the end.
| index | A1 |
|---|---|
| 0 | 1 |
| 1 | Hello |
| 2 | -8 |
| 3 | Hello |
So the output should be posNums = 1, negNums = 1 and helloCount = 2.
posNums = df.where(df['A1'] >= 0).sum()
This obviously doesn't work, because one can't compare a string to an int. How can I add a condition that skips the strings when I count the ints, and vice versa?
>Solution :
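One way is to use pd.to_numeric with errors="coerce", which turns every non-numeric entry (here the ‘Hello’ strings) into NaN, so the numbers and the NaNs can be counted separately: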
import pandas as pd
df = pd.DataFrame({"A1": ["Hello", 1, -1, "Hello", "Hello", -2, 2, -3]})
agg_funcs = {
    "negative": lambda x: x.lt(0).sum(),
    "positive": lambda x: x.ge(0).sum(),
    "nans": lambda x: x.isna().sum(),
}
out = pd.to_numeric(df["A1"], errors="coerce").agg(agg_funcs)
out:
negative 3
positive 2
nans 3
Name: A1, dtype: int64
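Note that the "nans" count covers every value that could not be converted to a number, not only ‘Hello’. If the column might hold other strings, the following minimal sketch counts ‘Hello’ by an exact match instead. It uses the example data and the names posNums, negNums and helloCount from the question; the intermediate name nums is only illustrative.

```python
import pandas as pd

# The question's example column: mixed strings and integers (object dtype).
df = pd.DataFrame({"A1": [1, "Hello", -8, "Hello"]})

# Non-numeric entries such as "Hello" become NaN. NaN compares False
# against any number, so it drops out of both numeric counts.
nums = pd.to_numeric(df["A1"], errors="coerce")

posNums = int((nums >= 0).sum())               # numbers >= 0
negNums = int((nums < 0).sum())                # numbers < 0
helloCount = int((df["A1"] == "Hello").sum())  # exact "Hello" matches only

print(posNums, negNums, helloCount)  # prints: 1 1 2
```

Comparing the original column to "Hello" works on the mixed column because equality against a string simply yields False for the integer entries, whereas the ordering comparison in the question raises an error.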