MySQL NOT IN subquery removing more lines that contained in the subquery

I’m having results I don’t understand from a MySQL query that should be pretty simple.
The goal I want to achieve is more complex, but I have reduced the problem to this: When I am using NOT IN subquery in a WHERE clause, it even removes lines that are not included in the subquery. In fact I get an empty result.

Here are 3 simple queries that show my problem:

SELECT DISTINCT tas.parent_id
FROM tasks tas
WHERE tas.deleted = 0

This one returns 10770 ids

SELECT DISTINCT tas.parent_id
FROM tasks tas
WHERE tas.status = 'Not Started'
AND tas.deleted = 0

This one returns 2028 ids

I was expecting the last one to return 10770 – 2028 = 8742 ids

SELECT DISTINCT tas.parent_id
FROM tasks tas
WHERE tas.parent_id NOT IN
(
SELECT DISTINCT tas2.parent_id
    FROM tasks tas2
    WHERE tas2.status = 'Not Started'
    AND tas2.deleted = 0
)
AND tas.deleted = 0

But it returns an empty result.

I feel stupid because there is probably a silly mistake somewhere, and I don’t understand what it is.

Please note that I KNOW is twisted, I could have added tas.status != 'Not Started' instead of relying on a subquery. But it is part of a more complex query, I only show the part that doesn’t make sense to me

Thank you for your help. Also this is my first time asking a question on Stack overflow so let me know if there are things I should change and keep in mind next time.

>Solution :

It looks like your query is correct and should return the expected result. However, there could be a few reasons why you’re getting an empty result:

  • The subquery is returning all the parent_id values in the table. In
    that case, your main query will return an empty result because there
    are no parent_id values that are not in the subquery. You can check
    this by running the subquery on its own and comparing the result to
    the full list of parent_id values.

  • There could be NULL values in the parent_id column. When using the
    NOT IN operator with a subquery, it’s important to check for NULL
    values explicitly. For example, you can add OR tas.parent_id IS NULL to your query to include any NULL values in the result.

Leave a Reply