I have 2 tables in my schema:
1 – covid_data (relevant columns: country, year, week, cases). Data example:
+-------+------+-----+-------+
|country| year | week| cases |
+-------+------+-----+-------+
|USA | 2022 | 1 | 58488 |
|USA | 2022 | 2 | 65154 |
|USA | 2022 | 3 | 74154 |
|USA | 2022 | 4 | 84158 |
|USA | 2022 | 5 | 96997 |
+-------+------+-----+-------+
countries_of_the_world (relevant column: country)
+-------+
|country|
+--------
|USA |
|France |
|Brazil |
|Germany|
|China |
+-------+
I need to retrieve the most recent record for each country in countries_of_the_world. The rule for the "most recent record" is determined by higher value in year and higher value in week. For example, right now we are at week 6, year 2022.
I did the following code:
SELECT country, max(year), max(week)
FROM covid_data
WHERE week in (
SELECT MAX(week)
FROM covid_data
WHERE year in (
SELECT MAX(year)
FROM covid_data
GROUP BY year)
GROUP BY week)
GROUP BY country;
but is retrieving like the infos apart. For example week 53 for year 2022, because this week is "available" for the past year:
+--------------------+
|country| year | week|
+--------------------+
|USA | 2022 | 53 |
+--------------------+
How can I do this query including joining with every country with the other table?
>Solution :
Untested but does the following give what you are after?
with c as (
select country, cases,
Row_Number() over(partition by country order by year desc, week desc) rn
from covid_data
)
select country, cases
from c
where rn = 1;