So I have data similar to this:
import pandas as pd
df = pd.DataFrame({'Order ID':[555,556,557,558,559,560,561,562,563,564,565,566],
'State':["MA","MA","MA","MA","MA","MA","CT","CT","CT","CT","CT","CT"],
'County':["Essex","Essex","Essex","Worcester","Worcester","Worcester","Bristol","Bristol","Bristol","Hartford","Hartford","Hartford"],
'AP':[50,50,75,100,100,125,150,150,175,200,200,225]})
but I need to add a column that shows the mode of AP grouped by State and County. I can get the mode this way:
(df.groupby(['State', 'County']).AP.agg(Mode = (lambda x: x.value_counts().index[0])).reset_index().round(0))
I’m just not sure how I can get that data added to the original data so that it looks like this:
| Order ID | State | County | AP | Mode |
|---|---|---|---|---|
| 555 | MA | Essex | 50 | 50 |
| 556 | MA | Essex | 50 | 50 |
| 557 | MA | Essex | 75 | 50 |
| 558 | MA | Worcester | 100 | 100 |
| 559 | MA | Worcester | 100 | 100 |
| 560 | MA | Worcester | 125 | 100 |
| 561 | CT | Bristol | 150 | 150 |
| 562 | CT | Bristol | 150 | 150 |
| 563 | CT | Bristol | 175 | 150 |
| 564 | CT | Hartford | 200 | 200 |
| 565 | CT | Hartford | 200 | 200 |
| 566 | CT | Hartford | 225 | 200 |
>Solution :
Use GroupBy.transform for new column:
df['Mode'] = (df.groupby(['State', 'County']).AP
.transform(lambda x: x.value_counts().index[0]))
Or Series.mode:
df['Mode'] = df.groupby(['State', 'County']).AP.transform(lambda x: x.mode().iat[0])