Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Incorrect results when using last_value()

This is the table:

id region variety price
1 Alexander Valley Cabernet Sauvignon 35
2 Alexander Valley Cabernet Sauvignon 45
3 Alexander Valley Merlot 19
4 California Sauvignon Blanc 8
5 California Pinot Noir 17

I wanted to find out for each region, the cheapest and most expensive variety, so the output should be:

region expensive cheap
Alexander Valley Cabernet Sauvignon Merlot
California Pinot Noir Sauvignon Blanc

I was able to get the correct result using two first_value()

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

SELECT
  DISTINCT region,
  FIRST_VALUE(variety) OVER (PARTITION BY region ORDER BY price DESC) AS expensive,
  FIRST_VALUE(variety) OVER (PARTITION BY region ORDER BY price) AS cheapest
FROM wine_list

I thought it would be equivalent to the following query

SELECT
  DISTINCT region,
  FIRST_VALUE(variety) OVER (PARTITION BY region ORDER BY price DESC) AS expensive,
  LAST_VALUE(variety) OVER (PARTITION BY region ORDER BY price DESC) AS cheapest
FROM wine_list

However now my output is:

region expensive cheap
Alexander Valley Cabernet Sauvignon Cabernet Sauvignon
Alexander Valley Cabernet Sauvignon Merlot
California Pinot Noir Pinot Noir
California Pinot Noir Sauvignon Blanc

Why is my output wrong? I am so baffled.

>Solution :

The default window for FIRST_VALUE and LAST_VALUE is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. I.e. it is the first resp. last value "so far".

You want it for the whole dataset, however, so you must explicitely describe the window range:

SELECT DISTINCT
  region,
  FIRST_VALUE(variety) OVER 
    (PARTITION BY region ORDER BY price DESC
     ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS expensive,
  LAST_VALUE(variety) OVER 
     (PARTITION BY region ORDER BY price DESC
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS cheapest
FROM wine_list;  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading