Error : This queryset contains a reference to an outer query and may only be used in a subquery

Advertisements

This might be a very easy to spot mistake but I am not the most familiar with the Django queries so I have been breaking my head for hours trying to solve that.

I have a Django Rest app and there I have 2 models:

class Student(models.Model):
    name = models.CharField(max_length=20)
    status = models.CharField(max_length=4, blank=True, default="UNK", choices=STATUS_TYPES)
    started_date = models.DateTimeField(null=False, auto_now_add=True)


class StudentStatusHistory(BaseTimestampModel):
    """Table of status changes for Students"""
    student = models.ForeignKey(
        "core.Student", on_delete=models.CASCADE, related_name="StatusHistory"
    )
    status = models.CharField(
        max_length=4, blank=True, default="UNK", choices=STATUS_TYPES
    )
    created_on = models.DateTimeField(null=False, auto_now_add=True)

For reference:

STATUS_TYPES = [("UNK", "Unknown"), (“PASS", “Passed"), (“FAIL", “Failed")]

I am trying to write a query that calculates for how many students their status changed to Pass within the time window of a week or less.

So for example if a Student.started_date is 2023-07-15 12:22:22 and the StudentStatusHistory.created_on field is 2023-07-21 12:22:22, for a student with Student.status that equals PASS, I want to count this student because they passed in less than a week (in 6 days in this example).

Additionaly, I want to calculate this count only for students that started the current month.

What I have so far is:

current_month_start = datetime(datetime.today().year, datetime.today().month, 1)

# Subquery to filter StudentStatusHistory objects for each Student object within a week or less
pass_within_week_subquery = StudentStatusHistory.objects.filter(
            student_id=OuterRef(“pk"),
            status="PASS",
            created_on__range=(OuterRef("started_date"), OuterRef(“started_date") + timedelta(weeks=1))
        )

# Subquery to filter Student objects created within the current month
student_within_month_subquery = Student.objects.filter(
            started_date__gte=current_month_start,
            pk=OuterRef(“pk"),
        )

# Main queryset to annotate Student objects with pass_within_week count
student_objects_with_pass_within_week = Student.objects.annotate(
            pass_within_week_count=Subquery(
                pass_within_week_subquery.annotate(count_pass_within_week=Count("pk")).values(“count_pass_within_week"),
                output_field=IntegerField()),
            is_within_month=Exists(student_within_month_subquery),
        )

# Count the number of student objects that satisfy the condition
count_student_pass_within_week = student_objects_with_pass_within_week.filter(pass_within_week_count__gt=0,
is_within_month=True).count()

return Response(data= count_student_pass_within_week)

(I am not sure what’s up with the python format, but please ignore or edit that part)

Everything is running seemingly smoothly until the last line about the count that returns the error:

ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.

I read a lot about the OuterRef, Subqueries, etc and I thought that by using them in a main queryset would solve this problem but clearly not.

Any ideas?

>Solution :

I think you can do this simpler with:

from datetime import timedelta

from django.db.models import F

Student.objects.filter(
    StatusHistory__status='PASS',
    StatusHistory__created_on__range=(
        F('started_date'),
        F('started_date') + timedelta(weeks=1),
    ),
).distinct().count()

we thus filter on the Students to only retain the ones with a related StatusHistory that has status='PASS', and the created_on in the time range of started_date and started_date and a week. We then count the distinct Students that match this. We need to use .distinct() [Django-doc] to prevent counting the student multiple times, if the status is set to PASS multiple times within that week.


Note: The related_name=… [Django-doc]
is the name of the manager to fetch the related objects in reverse. Therefore
normally the related_name of a ForeignKey or ManyToManyField is plural,
for example status_changes instead of StatusHistory.

Leave a ReplyCancel reply