Find the percentage cutoff for brand added by a customer

I have a pandas dataframe where I wanted to find the percentage cutoff for most prevalent brand customer added to his cart.(New column: Percentage Cutoff for brand)

customerID Date Brand Brand Count Percentage Cutoff for brand
1 1-1-2021 Tomy Hilfigure 3 75%
1 1-1-2021 Adidas 1 75%
2 2-1-2021 Club Room 2 66%
2 2-1-2021 Levis 1 66%
2 3-1-2021 Adidas 4 50%
2 3-1-2021 Polo 4 50%

For customer 1, the percentage cutoff will be 75% as he has added 3 items of Tomy Hilfigure brand in his cart and 1 item of Adidas brand(25%) hence the percentage cutoff for the customer 1 is 75% for date 1-1-2021.

For customer 2, on date 2-1-2021, the percentage cutoff will be 66.67% as he added 2 items of Club room brand(66.66%) and 1 item of Levis brand(33%).

I am using pandas group by function but couldn’t able to find the " Percentage Cutoff for brand".It would be great if you could give me a direction. Thank you.

>Solution :

Let me know if the cutoff calculation logic is not right, I used max_brand_count / total_brand_count

# Grouping by customerID and Date to calculate max and total brand count
gdf = df.groupby(['customerID', 'Date']) \
        max_brand_count=('Brand Count', 'max'),
        total_brand_count=('Brand Count', 'sum')
    ) \

# Calculate Percentage Cutoff for brand by dividing max and total brand counts
gdf['Percentage Cutoff for brand'] = gdf['max_brand_count'] / gdf['total_brand_count']
# Formatting it to percentage
gdf['Percentage Cutoff for brand'] = ['{:,.2%}'.format(val) for val in gdf['Percentage Cutoff for brand']]

Output of this groupby:

enter image description here

You can merge this to your original df if you want to have it all together.

final_df = df.merge(gdf, how='left', on=['customerID', 'Date'])

Leave a Reply