I’m working a personal project and I have this table called Movies, Formatted as such:
| Movie_Title | Genre | Release_Year |
|---|---|---|
| John Wick 4 | Action-Thriller-Crime | 2023 |
And I split it up by genre into a table as below because
I want to count the number of genres create per Release Year:
| Movie_Title | Genre | Release_Year |
|---|---|---|
| Avatar | Action | 2022 |
| Avatar | Adventure | 2022 |
| Avatar | Sci-Fi | 2022 |
| Ant-Man | Action | 2022 |
| Ant-Man | Adventure | 2022 |
| Ant-Man | Sci-Fi | 2022 |
| John Wick 4 | Action | 2023 |
| John Wick 4 | Thriller | 2023 |
| John Wick 4 | Crime | 2023 |
To do that I tried to partition it by two columns ‘Genre’ and ‘Release_Year’ using this Query:
Select Genre, Count(Genre) OVER (Partition by Genre,Release_Year) as Num_genre, Release_Year
FROM Movies
My Expected outcome is a table as such:
| Genre | Num_genre | Release_Year |
|---|---|---|
| Action | 2 | 2022 |
| Sci-Fi | 2 | 2022 |
| Adventure | 2 | 2022 |
| Action | 1 | 2023 |
| Thriller | 1 | 2023 |
| Crime | 1 | 2023 |
However when I run that query I end up getting a lot of dupes (same genre same year same count) back and it ends up looking like this:
Any help would be really appreciated thanks!
>Solution :
Use the GROUP BY aggregation clause:
SELECT
Genre,
COUNT(1) AS Num_genre,
Release_Year
FROM
Movies
GROUP BY
Genre,
Release_Year
;