Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Do we need separate DB Index for a SQL query containing multiple OR conditions

For example I have a query:-

Select *
from table
where (column1 = 'A' and Column2 = 'B' and Column3 = 'C')
   OR (Column1= 'z' and Column4 = 'X');

If I have to create a DB index for this query,

Do we need separate DB index for the first half and second half of the OR condition of the query ?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

or

A single DB index would be enough containing all columns present in the query?

Note:- The query is going to be executed in Microsoft SQL Server.

>Solution :

TL;DR

Create 2 INDEXes, one on Columns1-3 and the other on Columns1 & 4, INCLUDE all the other columns in the table on both INDEXes, and then use a UNION ALL query instead.

Demonstration

As I mention in the comment, if you wanted to do this in a single query, without a UNION (unlike in Tim Biegeleisen‘s answer) then more likely you would want an INDEX on Column1, and then INCLUDE all the other columns in your table (all, because you use SELECT * and so return all the columns, even those in referenced in the WHERE).

I do think that Tim’s solution would likely be better though, especially with a much larger data set, however, that too should have some INCLUDEs in it. Without those then either a scan will be performed (which isn’t helpful here) or a potentially expensive key lookup.

Here’s a little demonstration with a sample table and images of the query plans.

Sample data:

CREATE TABLE dbo.YourTable (Column1 char(1),
                            Column2 char(1),
                            Column3 char(1),
                            Column4 char(1),
                            Column5 int);
GO

INSERT INTO dbo.YourTable
VALUES('A','B','C','D',7),
      ('A','B','C','F',9),
      ('A','B','D','Z',1),
      ('D','B','C','X',9),
      ('Z','B','C','X',7),
      ('Z','D','C','X',4),
      ('Z','B','C','Y',9);
GO

Create Tim’s indexes and test:

CREATE INDEX idx1_tim ON dbo.yourTable (Column1, Column2, Column3);
CREATE INDEX idx2_tim ON dbo.yourTable (Column1, Column4);
GO

SELECT *
FROM dbo.yourTable
WHERE Column1 = 'A'
  AND Column2 = 'B'
  AND Column3 = 'C'
UNION ALL
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'z'
  AND Column4 = 'X';
GO

Query Plan demonstrating 2 table scans
Notice this have 2 table scans, as there are no INCLUDEs (this is not Column5‘s fault, and I demonstrate this later)

Single SELECT with an INDEX on Column1, but Column5 excluded from INCLUDE:

CREATE INDEX idx_larnu ON dbo.YourTable (Column1) INCLUDE(Column2, Column3, Column4);
GO
SELECT *
FROM dbo.YourTable
WHERE (Column1 = 'A' AND Column2 = 'B' AND Column3 = 'C')
   OR (Column1= 'z' AND Column4 = 'X');
GO

Query Plan demonstrating a table scan
Note this too has a table scan, due to the ommission of Column5. If we add that to the INCLUDE:

CREATE INDEX idx_larnu_fullinc ON dbo.YourTable (Column1) INCLUDE(Column2, Column3, Column4,Column5);
GO

SELECT *
FROM dbo.YourTable
WHERE (Column1 = 'A' AND Column2 = 'B' AND Column3 = 'C')
   OR (Column1= 'z' AND Column4 = 'X');
GO

Query plan demonstrating a single index scan
This unfortunately still results in a scan, but of the index this time not the table. For a larger table or for different values of Column1, SQL Server may make a different decision.

Now let’s just add an INCLUDE to Tim’s indexes:

--Drop mine so that they definitely aren't used
DROP INDEX idx_larnu_fullinc ON dbo.YourTable; 
DROP INDEX idx_larnu ON dbo.YourTable;
CREATE INDEX idx1_tim_inc ON dbo.yourTable (Column1, Column2, Column3) INCLUDE (Column4);
CREATE INDEX idx2_tim_inc ON dbo.yourTable (Column1, Column4) INCLUDE (Column2, Column3);
GO

SELECT *
FROM dbo.yourTable
WHERE Column1 = 'A'
  AND Column2 = 'B'
  AND Column3 = 'C'
UNION ALL
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'z'
  AND Column4 = 'X';
GO

Query Plan demonstrating 2 seeks
Hazaar! This is what we want. Also, just to confirm however that Column5 wasn’t causing the scans earlier, let’s drop those INDEXes and also Column5:

DROP INDEX idx1_tim_inc ON dbo.YourTable;
DROP INDEX idx2_tim_inc ON dbo.YourTable;
ALTER TABLE dbo.YourTable DROP COLUMN Column5;
GO
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'A'
  AND Column2 = 'B'
  AND Column3 = 'C'
UNION ALL
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'z'
  AND Column4 = 'X';
GO

Query plan demonstrating 2 scans
So, there you can see that Column5 wasn’t the cause the scans

--Clean up
DROP TABLE dbo.YourTable;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading