I have recently started learning SQL, set up a server, tried to get some practice with a kaggle DB I have found.
I would like to see the recorder on each lap (meaning track name field should be distinct), however it lists all lap records instead.
I have tried:
select distinct
r.name, min(t.fastestlaptime) as 'Fastest lap', d.surname, d.driverId, r.year
from
races r
inner join
results t on r.raceID = t.raceID
inner join
drivers d on t.driverID = d.driverID
where
(t.fastestlaptime not in ('\N'))
group by
r.name, d.forename, d.surname, d.driverId, r.year
order by
r.name
I get this result:
name | Fastest lap | surname | driverId | year |
---|---|---|---|---|
70th Anniversary Grand Prix | 1:28.451 | Hamilton | 1 | 2020 |
70th Anniversary Grand Prix | 1:29.465 | Verstappen | 830 | 2020 |
70th Anniversary Grand Prix | 1:29.477 | Albon | 848 | 2020 |
Abu Dhabi Grand Prix | 1:26.103 | Verstappen | 830 | 2021 |
Abu Dhabi Grand Prix | 1:26.419 | Pérez | 815 | 2021 |
Abu Dhabi Grand Prix | 1:26.615 | Hamilton | 1 | 2021 |
Australian Grand Prix | 1:28.321 | Villeneuve | 35 | 2006 |
Australian Grand Prix | 1:28.336 | Pérez | 815 | 2017 |
Australian Grand Prix | 1:28.416 | Glock | 10 | 2009 |
ofc, I just copy pasted some of the results, there are 7.6k rows, instead of just showing the rows with unique track names.
Can someone please explain to me? I’m malding not understanding the concept 😀
I would like to list distinct track names, and the lap record, name of the recorder, recorder ID and year next to it.
>Solution :
You probably want fastest lap per track? This is usually solved not by aggregates but with ROW_NUMBER(), something like:
select *
from
(
select *
, row_number() over(partition by name order by try_cast([Fastest lap] as time)) as sort
from (
VALUES (N'70th Anniversary Grand Prix', N'1:28.451', N'Hamilton', 1, 2020)
, (N'70th Anniversary Grand Prix', N'1:29.465', N'Verstappen', 830, 2020)
, (N'70th Anniversary Grand Prix', N'1:29.477', N'Albon', 848, 2020)
, (N'Abu Dhabi Grand Prix', N'1:26.103', N'Verstappen', 830, 2021)
, (N'Abu Dhabi Grand Prix', N'1:26.419', N'Pérez', 815, 2021)
, (N'Abu Dhabi Grand Prix', N'1:26.615', N'Hamilton', 1, 2021)
, (N'Australian Grand Prix', N'1:28.321', N'Villeneuve', 35, 2006)
, (N'Australian Grand Prix', N'1:28.336', N'Pérez', 815, 2017)
, (N'Australian Grand Prix', N'1:28.416', N'Glock', 10, 2009)
) t (name,[Fastest lap],surname,driverId,year)
) x
where sort = 1
This row_number() over(partition by name order by try_cast([Fastest lap] as time))
creates a sort column from 1 to … which groups by track name and then orders the results by the fastest lap. I use try_cast to avoid strange times, you should not use strings as well.
Then you have to wrap your select in a subquery because ROW_NUMBER() window function cannot appear directly in a WHERE clause, thus: where sort = 1