Frequency of categories in a column conditioned on another column for each observation

May 3, 2023

I have multiple columns in my dataset, two of which are "id" and "Sentiment". I want to find the frequency of each sentiment for each id and add it as a new column in the same dataset. I have tried several commands but have not been able to get the correct frequency. The sample data and the commands I expected to work are shown below.

Sample DataFrame:

data = {'id': ['205', '205', '204', '204', '204'], 'First_name': ['Jon','Bill','Maria','Emma', 'Bee'], 
     'Sentiment': ['Positive', 'Positive', 'Neutral', 'Positive', 'Positve']}
df = DataFrame(data)

The commands that I tried:

for x in df['id']:
  df['sent_freq'] = df.Sentiment.map(df.Sentiment.value_counts())

df['sent_freq'] = df.groupby('Sentiment')['id'].transform('count')

The output that I get from both is:

    id  First_name  Sentiment       sent_freq
0   205 Jon         Positive        3
1   205 Bill        Positive        3
2   204 Maria       Neutral         1
3   204 Emma        Positive        3
4   204 Bee         Positve         1

which is wrong. The expected output is:

    id  First_name  Sentiment       sent_freq
0   205 Jon         Positive        2
1   205 Bill        Positive        2
2   204 Maria       Neutral         1
3   204 Emma        Positive        2
4   204 Bee         Positve         2

Any leads will be highly appreciated.

Solution:

Example

Your sample data contains a typo ('Positve' in the last row), so I corrected it and built the frame with pd.DataFrame:

import pandas as pd

data = {'id': ['205', '205', '204', '204', '204'], 'First_name': ['Jon','Bill','Maria','Emma', 'Bee'], 
     'Sentiment': ['Positive', 'Positive', 'Neutral', 'Positive', 'Positive']}
df = pd.DataFrame(data)

Code

Replace 'count' with pd.Series.nunique, which counts the distinct ids within each Sentiment group:

df['sent_freq'] = df.groupby('Sentiment')['id'].transform(pd.Series.nunique)

Output:

    id  First_name  Sentiment       sent_freq
0   205 Jon         Positive        2
1   205 Bill        Positive        2
2   204 Maria       Neutral         1
3   204 Emma        Positive        2
4   204 Bee         Positive        2
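
Note that pd.Series.nunique in the code above returns the number of distinct ids that share a sentiment, which happens to match the expected output for this sample. If the goal is instead the number of times each sentiment occurs within each id, as the question describes, one possible sketch is to group by both columns. The extra row for 'Ann' is hypothetical (it is not in the original data) and is added only to show where the two approaches diverge.

```python
import pandas as pd

# Corrected sample data from the answer above.
data = {'id': ['205', '205', '204', '204', '204'],
        'First_name': ['Jon', 'Bill', 'Maria', 'Emma', 'Bee'],
        'Sentiment': ['Positive', 'Positive', 'Neutral', 'Positive', 'Positive']}
df = pd.DataFrame(data)

# Count the rows in each (id, Sentiment) pair and broadcast the count to every row.
df['sent_freq'] = df.groupby(['id', 'Sentiment'])['Sentiment'].transform('count')
print(df)

# Hypothetical extra row: a third 'Positive' entry for id 205.
extra = pd.DataFrame({'id': ['205'], 'First_name': ['Ann'], 'Sentiment': ['Positive']})
df2 = pd.concat([df.drop(columns='sent_freq'), extra], ignore_index=True)

# Occurrences of each sentiment within each id.
per_id = df2.groupby(['id', 'Sentiment'])['Sentiment'].transform('count')
# Distinct ids per sentiment (the approach used in the answer above).
distinct_ids = df2.groupby('Sentiment')['id'].transform(pd.Series.nunique)

print(df2.assign(per_id=per_id, distinct_ids=distinct_ids))
```

On the original five rows both approaches give 2, 2, 1, 2, 2. With the hypothetical sixth row, the two-column grouping reports 3 for the id 205 'Positive' rows, while the nunique version still reports 2 because only two distinct ids carry that sentiment.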