Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

LEFT JOIN with tables having boolean data producing unexpected result set

A Table_1 with only column BOOLVALUE(int) having records as

  1. 1
  2. 1
  3. 0
  4. 0
  5. 0

and another Table_2 with only column BOOLVALUE(int) having records as

  1. 1
  2. 1
  3. 1
  4. 0
  5. 0

.. I am trying to run a query

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

select t1.BOOLVALUE from Table_1 t1
left join Table_2 t2 on t1.BOOLVALUE=t2.BOOLVALUE

and to my surprise output is not what I expected.There are 12 rows with 6 1’s and 6 0’s. But doesn’t this invalidates how joins work ?

>Solution :

12 rows is completely expected as you have 2 rows related to 3 rows, resulting in 6 rows, and 3 rows related to 2 rows resulting in 6 rows; add these together and you get 12.

When you JOIN all related rows are JOINed based on the ON clause. Your ON clause is t1.BOOLVALUE=t2.BOOLVALUE. This means all the 1s inTable_1 relate to all the 1s in Table_2; so that’s 2 rows related to 3 rows (2 * 3). Then all the 0s inTable_1 relate to all the 0s in Table_2; so that’s 3 rows related to 2 rows (3 * 2). Hence (2 * 3) + (3 * 2) = 6 + 6 = 12.

If we add an ID column to the table, this might become a little clearer.

Let’s say you have 2 tables like this:

ID1 I1
1 1
2 1
3 0
4 0
5 0
ID2 I2
1 1
2 1
3 1
4 0
5 0

Then lets say you have the following query:

SELECT T1.ID1,
       T2.ID2,
       T1.I1,
       T2.I2
FROM dbo.Table1 T1
     JOIN dbo.Table2 T2 ON T1.I1 = T2.I2
ORDER BY T1.ID1
         T2.ID2;

This would result in the following data set:

ID1 ID2 I1 I2
1 1 1 1
1 2 1 1
1 3 1 1
2 1 1 1
2 2 1 1
2 3 1 1
3 4 0 0
3 5 0 0
4 4 0 0
4 5 0 0
5 4 0 0
5 5 0 0

Here you can see you have a many to many join, and where the "extra" rows are coming from.

If you LEFT JOINed on the ID and I columns, starting at Table1, you would get 5 rows, with 1 row have NULL values for ID2 and I2:

SELECT T1.ID1,
       T2.ID2,
       T1.I1,
       T2.I2
FROM dbo.Table1 T1
     LEFT JOIN dbo.Table2 T2 ON T1.ID1 = T2.ID1
                            AND T1.I1 = T2.I2
ORDER BY T1.ID1
         T2.ID2;
ID1 ID2 I1 I2
1 1 1 1
2 2 1 1
3 NULL 0 NULL
4 4 0 0
5 5 0 0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading