Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how to filter only the rows containing first occurrence of an item in Django query

I have below data from my Django model

id  Date  value
0   1975     a
21  1975     b
1   1976     b
22  1976     c
3   1977     a
2   1977     b
4   1978     c
25  1978     d
5   1979     e
26  1979     f
6   1980     a
27  1980     f

Model

class Items(models.Model):
    date = models.DateField()
    value = models.CharField()

I am having trouble finding a way to keep only the lines containing the first occurrence of a’value’. I want to drop duplicate ‘values’, keeping the row with the lowest ‘Date’.The end result should be:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

id   Date value
0   1975     a
21  1975     b
22  1976     c
25  1978     d
5   1979     e
26  1979     f

>Solution :

You can filter by eliminating all items for which there is an earlier record:

from django.db.models import Exists, OuterRef

Items.objects.filter(
    ~Exists(Items.objects.filter(
        Q(date__lt=OuterRef('date')) | Q(Date=OuterRef('date'), pk__lt=OuterRef('pk')),
        value=OuterRef('value')
    ))
)

We thus check if there is a record with the same value where the date is smaller than our current record, or where the date is the same but the primary key is smaller (as tie breaker).


Note: normally a Django model is given a singular name, so Items instead of Item.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading