I am learning pandas, and have come across a super easy problem, but not sure how to solve it. I have a dataframe with 2 columns. One column is categorical with three levels i.e. the values are non-unique, and one column is quantitative. For example,

```
df = pd.DataFrame({"X": list("AAABBC"), "Y": range(6)})
```

I want to sum each value in the `Y`

column according to each unique-value in the `X`

column. I.e. I should obtain a dataframe with 2 columns; one column being the unique values in `X`

i.e. `["A","B","C"]`

and the other column being the summed values in Y corresponding to each level in X i.e.`[3,7,5]`

.

This is obviously quite a basic thing to do, but I have tried googling and I couldn’t find the answer, so it’s quite frustrating. I think the answer should be quite simply, probably a one-liner, but I just don’t know the command. I am quite new to pandas, so please go easy ðŸ™‚

### >Solution :

You are looking for the `.groupby`

method. Relevant docs:

- https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

Usage example:

```
import pandas as pd
df = pd.DataFrame({"X": list("AAABBC"), "Y": range(6)})
df_by_x = df.groupby("X").sum()
print(df_by_x)
```

Result:

```
Y
X
A 3
B 7
C 5
```

As I’m sure you can imagine, there are many many other things you can do with grouping in Pandas. There are a handful of different ways to solve the problem you proposed here, although the "one obvious way to do it" would be the `.sum()`

method as above. However, it might be a useful learning exercise to work through some of the other ways you might accomplish the same task.

Another related search term for this operation is "split-apply-combine", but that was somewhat of a short-lived trend in terminology (mostly confined to R users), while "group by" is a long-standing term inherited from SQL.