I am trying to write a query that will group by different combination of certain columns and give me ‘
N‘ records for each combination of the columns. I have been able to categorize it using a single column, but I am unable to get the grouping correct when it involves multiple columns. Below is an example of how the data in the table looks like and the output I am expecting. Here
'N' = 2.
Expected output :
- max 2 records for each unique combination of columns
- which 2 records are selected is not important.
- If there are
< Nrecords, then return all rows (i.e the no of records should not exceed
'N', but can be
UPDATE : Was able to get the desired output. My question now is – if there is any further efficient way to do it, given I will need to run it on a table containing several columns and the number of combinations will quickly increase.
WITH SUBSET AS ( SELECT [ServerName],[SKU],[Feature1],[Feature2],[Feature3],ROW_NUMBER() OVER ( PARTITION BY [SKU], [Feature1], [Feature2],[Feature3] order by [ServerName]) AS [ROW NUMBER] FROM [tablename] ) SELECT ServerName ,SKU ,Feature1 ,Feature2 ,Feature3 FROM [SUBSET] WHERE [SUBSET].[ROW NUMBER] <= 2
Use the ROW_NUMBER() function in the SQL query to assign a sequential number to each row within the specified grouping. After that filter the result to only include rows where the row number is less than or equal to ‘N’.
SELECT column1, column2, column3, column4, column5 FROM ( SELECT column1, column2, column3, column4, column5, ROW_NUMBER() OVER (PARTITION BY column1, column2, column3 ORDER BY column4, column5) AS row_num FROM custom_table ) t WHERE row_num <= 2 ORDER BY column1, column2, column3, column4, column5;
The PARTITION BY clause specifies the columns which are needed to group by, and the ORDER BY clause determines the order in which the rows will be numbered within each group.