I have a pandas DataFrame as below:
import pandas as pd
data = [['A', 'hi'], ['A', 'hi1'], ['A', 'bye'], ['A', 'bye2'], ['B', 'hi2'], ['B', 'hi'], ['B', 'bye']]
df = pd.DataFrame(data, columns=['category', 'Value'])
I need a list of the values that appear in both category A and category B, i.e.:
['hi', 'bye']
Currently I split the DataFrame into two DataFrames, one for A and one for B, and then apply a set intersection to them to get the common items in the 'Value' column. Please advise: is there a way to do this without splitting it into two DataFrames?
>Solution :
You can use a set intersection:
For A and B only:
out = (set(df.loc[df['category'].eq('A'), 'Value'])
       & set(df.loc[df['category'].eq('B'), 'Value'])
       )
Generic method for all groups:
out = set.intersection(*df.groupby('category')['Value'].agg(set))
Output: {'bye', 'hi'}
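Since the question asks for a list and a set has no defined order, here is a minimal self-contained sketch of the generic method with the result converted to a list. The names `sets_per_category`, `common`, `common_list` and `ordered` are only illustrative, and the sketch assumes the DataFrame has at least one category (`set.intersection` needs at least one set to work with).

```python
import pandas as pd

data = [['A', 'hi'], ['A', 'hi1'], ['A', 'bye'], ['A', 'bye2'],
        ['B', 'hi2'], ['B', 'hi'], ['B', 'bye']]
df = pd.DataFrame(data, columns=['category', 'Value'])

# Build one set of values per category, then intersect all of them.
sets_per_category = df.groupby('category')['Value'].agg(set)
common = set.intersection(*sets_per_category)

# Option 1: sort the set to get a reproducible list.
common_list = sorted(common)
print(common_list)  # ['bye', 'hi']

# Option 2: keep the order in which values first appear in the DataFrame.
ordered = [v for v in df['Value'].unique() if v in common]
print(ordered)  # ['hi', 'bye']
```

Option 2 gives the order shown in the question, because 'hi' appears before 'bye' in the 'Value' column.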