SQL count and filter query optimization

April 5, 2023

Given a database table article_bookmarks with columns userId and articleId, if a user bookmarks an article a table entry (userId, articleId) with respective ids is made.

I want to retrieve the total number of users that bookmarked a specific article given by articleId, together with the info if the user (given by userId) also bookmarked that article.

My SQL query currently looks like this:

SELECT COUNT(ar.userId) AS bookmark_count,
       EXISTS(SELECT 1 FROM article_bookmarks WHERE articleId=:articleId AND userId=:userId) AS user_bookmarked
FROM article_bookmarks ar
WHERE ar.articleId=:articleId

I have the impression that this query is not optimal, as it seems inefficient and redundant to read out basically the same userId data twice: One time aggregated in a "count", the other time filtered by a single userId with a sub-select.

Is there a way to optimize or simplify the given SQL query (i.e. something like SELECT COUNT(ar.userId) AS bookmark_count, EXISTS(:userId IN ar.userId) AS user_bookmarked […] in valid SQL)?

>Solution :

You can use conditional aggregation. Here the table article_bookmarks is scanned only once.

SELECT 
    COUNT(ar.userId) AS bookmark_count,
    MAX(CASE WHEN userId=:userId THEN 1 ELSE 0 END) AS user_bookmarked
FROM article_bookmarks ar
WHERE ar.articleId=:articleId

To avoid scanning the whole table you need an index on (articleId, userId) or (articleId) include (userId).