SQL DISTINCT MIN() – What am I doing wrong?

I have recently started learning SQL, set up a server, tried to get some practice with a kaggle DB I have found.

I would like to see the recorder on each lap (meaning track name field should be distinct), however it lists all lap records instead.

I have tried:

select distinct 
    r.name, min(t.fastestlaptime) as 'Fastest lap', d.surname,  d.driverId, r.year
from
    races r
inner join 
    results t on r.raceID = t.raceID
inner join 
    drivers d on t.driverID = d.driverID
where 
    (t.fastestlaptime not in ('\N'))
group by 
    r.name, d.forename, d.surname, d.driverId, r.year
order by 
    r.name

I get this result:

name Fastest lap surname driverId year
70th Anniversary Grand Prix 1:28.451 Hamilton 1 2020
70th Anniversary Grand Prix 1:29.465 Verstappen 830 2020
70th Anniversary Grand Prix 1:29.477 Albon 848 2020
Abu Dhabi Grand Prix 1:26.103 Verstappen 830 2021
Abu Dhabi Grand Prix 1:26.419 Pérez 815 2021
Abu Dhabi Grand Prix 1:26.615 Hamilton 1 2021
Australian Grand Prix 1:28.321 Villeneuve 35 2006
Australian Grand Prix 1:28.336 Pérez 815 2017
Australian Grand Prix 1:28.416 Glock 10 2009

1

ofc, I just copy pasted some of the results, there are 7.6k rows, instead of just showing the rows with unique track names.

Can someone please explain to me? I’m malding not understanding the concept 😀

I would like to list distinct track names, and the lap record, name of the recorder, recorder ID and year next to it.

>Solution :

You probably want fastest lap per track? This is usually solved not by aggregates but with ROW_NUMBER(), something like:

select *
from 
(
        select  *
        ,   row_number() over(partition by name order by try_cast([Fastest lap] as time)) as sort 
        from (
        VALUES  (N'70th Anniversary Grand Prix', N'1:28.451', N'Hamilton', 1, 2020)
        ,   (N'70th Anniversary Grand Prix', N'1:29.465', N'Verstappen', 830, 2020)
        ,   (N'70th Anniversary Grand Prix', N'1:29.477', N'Albon', 848, 2020)
        ,   (N'Abu Dhabi Grand Prix', N'1:26.103', N'Verstappen', 830, 2021)
        ,   (N'Abu Dhabi Grand Prix', N'1:26.419', N'Pérez', 815, 2021)
        ,   (N'Abu Dhabi Grand Prix', N'1:26.615', N'Hamilton', 1, 2021)
        ,   (N'Australian Grand Prix', N'1:28.321', N'Villeneuve', 35, 2006)
        ,   (N'Australian Grand Prix', N'1:28.336', N'Pérez', 815, 2017)
        ,   (N'Australian Grand Prix', N'1:28.416', N'Glock', 10, 2009)
        ) t (name,[Fastest lap],surname,driverId,year)
    ) x
where sort = 1

This row_number() over(partition by name order by try_cast([Fastest lap] as time)) creates a sort column from 1 to … which groups by track name and then orders the results by the fastest lap. I use try_cast to avoid strange times, you should not use strings as well.

Then you have to wrap your select in a subquery because ROW_NUMBER() window function cannot appear directly in a WHERE clause, thus: where sort = 1

Leave a Reply