Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Find lagging rows of a query omitted by a WHERE clause in SQL/SQLite

Setup

I have a table of race times, listing a runner, their team, and their race time:

CREATE TABLE race (person TEXT, team TEXT, timer FLOAT);

INSERT INTO race
  (person, team, timer)
VALUES
  ("ahmed", "red", 4.3),
  ("baadur", "green", 4.4),
  ("carel", "red", 4.5),
  ("dada", "green", 4.9),
  ("eder", "green", 5.0),
  ("farai", "red", 5.1);

I can make a list of all people on the red team and their ranking:

SELECT person, ROW_NUMBER() OVER(ORDER BY timer) AS ranking FROM race WHERE team="red";

which emits

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

person ranking
ahmed 1
carel 2
farai 3

Question

I want to also get the name of runner who followed each of these red runners, i.e., who had the next slowest time—so I want:

person ranking next runner
ahmed 1 baadur
carel 2 dada
farai 3 null

where note how since nobody has a slower time than Farai, Farai’s third column is null.

Can I do this efficiently with a single query?

Considerations

I’d like to avoid first getting the list of red runners and their times with one query and then making another three (or more generally N) queries to get the runner with the next time, e.g., this is what I do not want to do:

SELECT person FROM race WHERE timer>=4.3 AND person != "ahmed" LIMIT 1;
SELECT person FROM race WHERE timer>=4.5 AND person != "carel" LIMIT 1;
SELECT person FROM race WHERE timer>=5.1 AND person != "farai" LIMIT 1;
-- 😭

I might be able to rework this above approach to be a single query instead of multiple individual ones but I feel like it should be possible to get what I want in a single query if there was a way to run a subquery for each row that matches the WHERE team="red" clause to find the next row (which could be made fast by an index on timer), but I’m not sure if that’s possible.

For example, could I use SQLite’s lag window function to achieve this? By itself, lag would look at rows that met my WHERE team="red" criterion so it wouldn’t return the next-slowest runner if they were on the green team or another non-red team.

Is there a general term for this kind of query?

There could be many teams and many runners so I’d like to know how to make this kind of lookup as efficient as possible.

>Solution :

Use LAG here, along with ROW_NUMBER with partition:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY team ORDER BY timer) ranking,
              LEAD(person) OVER (ORDER BY timer) next_runner
    FROM race
)

SELECT person, ranking, next_runner
FROM cte
WHERE team = 'red'
ORDER BY ranking;

screen capture from demo link below

Demo

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading