I have below data from my Django model
id Date value
0 1975 a
21 1975 b
1 1976 b
22 1976 c
3 1977 a
2 1977 b
4 1978 c
25 1978 d
5 1979 e
26 1979 f
6 1980 a
27 1980 f
Model
class Items(models.Model):
date = models.DateField()
value = models.CharField()
I am having trouble finding a way to keep only the lines containing the first occurrence of a’value’. I want to drop duplicate ‘values’, keeping the row with the lowest ‘Date’.The end result should be:
id Date value
0 1975 a
21 1975 b
22 1976 c
25 1978 d
5 1979 e
26 1979 f
>Solution :
You can filter by eliminating all items for which there is an earlier record:
from django.db.models import Exists, OuterRef
Items.objects.filter(
~Exists(Items.objects.filter(
Q(date__lt=OuterRef('date')) | Q(Date=OuterRef('date'), pk__lt=OuterRef('pk')),
value=OuterRef('value')
))
)
We thus check if there is a record with the same value where the date is smaller than our current record, or where the date is the same but the primary key is smaller (as tie breaker).
Note: normally a Django model is given a singular name, so
Itemsinstead of.Item