I have a data frame like this:
df = pd.DataFrame({"product": [1,2,3,4,5], "company": ["A", "B","B","A","B"], "state": ["CA", "NY", "CA", "CA", "NY"]})
company state product
0 A CA 1
1 B NY 2
2 B CA 3
3 A CA 4
4 B NY 5
I would like a bool that picks out just the rows that correspond to states which have only one company in them. In this case that would be only NY which has only company B, so the desired bool would be [False, True, False, False, True]
Alternatively, I would like to know the set of states that have only one company in them. I guess I could do that e.g. using value_counts once I have the bool.
How do I do this?
Thanks!
>Solution :
You can use groupby_nunique with transform to broadcast the result over rows then just check if the result is equal to 1:
df['flag'] = df.groupby('state')['company'].transform('nunique').eq(1)
print(df)
# Output
product company state flag
0 1 A CA False
1 2 B NY True
2 3 B CA False
3 4 A CA False
4 5 B NY True