SQL: Understanding the NOT IN function

I am working with Netezza SQL.

I have the following table:

   student_id years
1         123  2010
2         123  2011
3         123  2012
4         124  2010
5         124  2011
6         124  2012
7         124  2013
8         125  2010
9         125  2011
10        125  2012
11        125  2013
12        125  2014

My Question: I only want to select students if they ONLY have years = (2010,2011,2012,2013). This means that in the above table, ONLY student_id = 124 will be selected.

Here is my query using the NOT IN function (first use NOT IN to choose students that do not meet the condition … then use NOT IN to exclude students that do not meet the condition) :

SELECT student_id, years
FROM my_data
WHERE student_id NOT IN (
    SELECT student_id
    FROM my_data
    WHERE years NOT IN (2010, 2011, 2012, 2013)
)

However, this query is incorrectly returning student 123 and student 124:

  student_id years
1        123  2010
2        123  2011
3        123  2012
4        124  2010
5        124  2011
6        124  2012
7        124  2013

Can someone please show me how to fix this? The final answer should only return student 124.

Thanks!

>Solution :

I would use an aggregation approach here

WITH cte AS (
    SELECT student_id
    FROM my_data
    GROUP BY student_id
    HAVING COUNT(CASE WHEN years NOT IN (2010, 2011, 2012, 2013)
                      THEN 1 END) = 0 AND
           COUNT(DISTINCT years) = 4
)

SELECT student_id, years
FROM my_data
WHERE student_id IN (SELECT student_id FROM cte);

Leave a Reply