Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Performing a JOIN with an ORDER BY and LIMIT in the joined table in SQL/Postgres

I have a table of companies and a table of transcripts that are related to those companies — many-to-one.

I want to be able to list companies that have not been checked recently AND do not have transcripts published recently.

The problem I’m running into is when I join the tables, I can’t figure out how to only limit the second condition to the most recent transcript. My query is unintentionally returning companies that have recent transcripts if it detects a transcript that is older.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Company Table

id name last_checked_at
1 ACME 2022-10-11 02:50:52.184975+00
2 MeepMeep 2022-05-12 02:50:52.184975+00
3 TNT 2022-05-12 02:50:52.184975+00

Transcripts Table

id company published_at
5 1 2022-10-11 02:50:52.184975+00
6 2 2022-10-11 02:50:52.184975+00
7 2 2022-05-12 02:50:52.184975+00
8 3 2022-06-11 02:50:52.184975+00
9 3 2022-03-12 02:50:52.184975+00

Desired Logic

  • Select company.ids
  • Where company.last_checked_at is older than 1 week from today
  • And only the most recent related transcript.published_at is older than 3 months from today

Expected Behavior

Using the data in the tables above:

  • Result does not contain ACME since it was last_checked_at within 7 days of today
  • Result does not contain MeepMeep even though last_checked_at is greater than 7 days, since the most recent transcript was published_at within 3 months of today
  • Result DOES contain TNT since last_checked_at is greater than 7 days and the most recent transcript was published_at greater than 3 months of today

Attempts

SELECT * FROM summaries s LEFT OUTER JOIN companies c ON s.company = c.id WHERE s.published_at < now() - INTERVAL '3 months' ORDER BY s.published_at ASC limit 1

>Solution :

We can try using DISTINCT ON on the transcripts table:

WITH cte AS (
    SELECT DISTINCT ON (company) *
    FROM summaries
    WHERE published_at < NOW() - INTERVAL '3 months'
    ORDER BY company, published_at DESC
)

SELECT c.name, c.last_checked_at, s.published_at
FROM companies c
INNER JOIN cte s
    ON s.company = c.id
WHERE c.last_checked_at < NOW() - INTERVAL '1 week';
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading