How to optimise MYSQL query which uses subquery

In my project I have a use case where we need to trigger an API on the first of every month to send gift cards to users via email who have their birthday in current month.

I have a SQL query that fetches user information having their birthday in a given month and have not yet received gift cards.

NOTE: Gift cards only needs to be sent to a user once every year.

I wrote a query in the method which is being called by the API endpoint, but the API is giving timeout error due to large number of users meeting the criteria. Also after fetching the users data, heavy write operations are being performed along with calling 3rd party service to send emails to the users about the gift code.

Below is the query that needs to be optimised:

SELECT id,
first_name,
last_name,
email,
date_of_birth
FROM `user`
WHERE DATE_FORMAT(date_of_birth, '%m') = 6
AND email IS NOT NULL AND status != 0
AND email NOT IN
(
    SELECT recipient_email FROM gift_card WHERE
    DATE_FORMAT(scheduled_at, '%m') = 6
    AND
    DATE_FORMAT(scheduled_at, '%Y') = 2022
    AND
    message = 'Happy birthday! from BURST'
    AND
    status = 1
);

I tried using NOT EXISTS which takes even more time.

SELECT id,
first_name,
last_name,
email,
date_of_birth
FROM `user`
WHERE DATE_FORMAT(date_of_birth, '%m') = 6
AND email IS NOT NULL AND status != 0
AND NOT EXISTS
(
    SELECT 1 FROM gift_card WHERE
    DATE_FORMAT(scheduled_at, '%m') = 6
    AND
    DATE_FORMAT(scheduled_at, '%Y') = 2022
    AND
    message = 'Happy birthday! from BURST'
    AND
    status = 1
    AND
    recipient_email = email
);

Please ask for more clarification in case I missed something before downvoting.

>Solution :

First, I would rewrite your subquery to remove DATE_FORMAT, which would preclude the use any index:

SELECT id, first_name, last_name, email, date_of_birth
FROM user
WHERE DATE_FORMAT(date_of_birth, '%m') = 6 AND
      email IS NOT NULL AND status != 0 AND
      NOT EXISTS
(
    SELECT 1
    FROM gift_card
    WHERE scheduled_at >= '2022-06-01' AND scheduled_at < '2022-07-01' AND
          message = 'Happy birthday! from BURST' AND
          status = 1 AND
          recipient_email = email
);

You may try adding the following index to the gift_card table:

CREATE INDEX idx ON gift_card (scheduled_at, message, status, recipient_email);

If used, the above index should allow MySQL to do a rapid lookup of each record in the outer query against the gift_card subquery.

The join version of this might be:

SELECT u.id, u.first_name, u.last_name, u.email, u.date_of_birth
FROM user u
LEFT JOIN gift_card g
    ON g.scheduled_at >= '2022-06-01' AND g.scheduled_at < '2022-07-01' AND
       g.message = 'Happy birthday! from BURST' AND
       g.status = 1 AND
       g.recipient_email = g.email
WHERE
    DATE_FORMAT(u.date_of_birth, '%m') = 6 AND
    u.email IS NOT NULL AND u.status != 0 AND
    g.email IS NULL;

Leave a Reply