# Summing values in column of dataframe according to nonunique values in another column

I am learning pandas, and have come across a super easy problem, but not sure how to solve it. I have a dataframe with 2 columns. One column is categorical with three levels i.e. the values are non-unique, and one column is quantitative. For example,

``````df = pd.DataFrame({"X": list("AAABBC"), "Y": range(6)})
``````

I want to sum each value in the `Y` column according to each unique-value in the `X` column. I.e. I should obtain a dataframe with 2 columns; one column being the unique values in `X` i.e. `["A","B","C"]` and the other column being the summed values in Y corresponding to each level in X i.e.`[3,7,5]`.

This is obviously quite a basic thing to do, but I have tried googling and I couldn’t find the answer, so it’s quite frustrating. I think the answer should be quite simply, probably a one-liner, but I just don’t know the command. I am quite new to pandas, so please go easy 🙂

### >Solution :

You are looking for the `.groupby` method. Relevant docs:

Usage example:

``````import pandas as pd

df = pd.DataFrame({"X": list("AAABBC"), "Y": range(6)})

df_by_x = df.groupby("X").sum()

print(df_by_x)
``````

Result:

``````   Y
X
A  3
B  7
C  5
``````

As I’m sure you can imagine, there are many many other things you can do with grouping in Pandas. There are a handful of different ways to solve the problem you proposed here, although the "one obvious way to do it" would be the `.sum()` method as above. However, it might be a useful learning exercise to work through some of the other ways you might accomplish the same task.

Another related search term for this operation is "split-apply-combine", but that was somewhat of a short-lived trend in terminology (mostly confined to R users), while "group by" is a long-standing term inherited from SQL.