Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Error : This queryset contains a reference to an outer query and may only be used in a subquery

This might be a very easy to spot mistake but I am not the most familiar with the Django queries so I have been breaking my head for hours trying to solve that.

I have a Django Rest app and there I have 2 models:

class Student(models.Model):
    name = models.CharField(max_length=20)
    status = models.CharField(max_length=4, blank=True, default="UNK", choices=STATUS_TYPES)
    started_date = models.DateTimeField(null=False, auto_now_add=True)


class StudentStatusHistory(BaseTimestampModel):
    """Table of status changes for Students"""
    student = models.ForeignKey(
        "core.Student", on_delete=models.CASCADE, related_name="StatusHistory"
    )
    status = models.CharField(
        max_length=4, blank=True, default="UNK", choices=STATUS_TYPES
    )
    created_on = models.DateTimeField(null=False, auto_now_add=True)

For reference:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

STATUS_TYPES = [("UNK", "Unknown"), (“PASS", “Passed"), (“FAIL", “Failed")]

I am trying to write a query that calculates for how many students their status changed to Pass within the time window of a week or less.

So for example if a Student.started_date is 2023-07-15 12:22:22 and the StudentStatusHistory.created_on field is 2023-07-21 12:22:22, for a student with Student.status that equals PASS, I want to count this student because they passed in less than a week (in 6 days in this example).

Additionaly, I want to calculate this count only for students that started the current month.

What I have so far is:

current_month_start = datetime(datetime.today().year, datetime.today().month, 1)

# Subquery to filter StudentStatusHistory objects for each Student object within a week or less
pass_within_week_subquery = StudentStatusHistory.objects.filter(
            student_id=OuterRef(“pk"),
            status="PASS",
            created_on__range=(OuterRef("started_date"), OuterRef(“started_date") + timedelta(weeks=1))
        )

# Subquery to filter Student objects created within the current month
student_within_month_subquery = Student.objects.filter(
            started_date__gte=current_month_start,
            pk=OuterRef(“pk"),
        )

# Main queryset to annotate Student objects with pass_within_week count
student_objects_with_pass_within_week = Student.objects.annotate(
            pass_within_week_count=Subquery(
                pass_within_week_subquery.annotate(count_pass_within_week=Count("pk")).values(“count_pass_within_week"),
                output_field=IntegerField()),
            is_within_month=Exists(student_within_month_subquery),
        )

# Count the number of student objects that satisfy the condition
count_student_pass_within_week = student_objects_with_pass_within_week.filter(pass_within_week_count__gt=0,
is_within_month=True).count()

return Response(data= count_student_pass_within_week)

(I am not sure what’s up with the python format, but please ignore or edit that part)

Everything is running seemingly smoothly until the last line about the count that returns the error:

ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.

I read a lot about the OuterRef, Subqueries, etc and I thought that by using them in a main queryset would solve this problem but clearly not.

Any ideas?

>Solution :

I think you can do this simpler with:

from datetime import timedelta

from django.db.models import F

Student.objects.filter(
    StatusHistory__status='PASS',
    StatusHistory__created_on__range=(
        F('started_date'),
        F('started_date') + timedelta(weeks=1),
    ),
).distinct().count()

we thus filter on the Students to only retain the ones with a related StatusHistory that has status='PASS', and the created_on in the time range of started_date and started_date and a week. We then count the distinct Students that match this. We need to use .distinct() [Django-doc] to prevent counting the student multiple times, if the status is set to PASS multiple times within that week.


Note: The related_name=… [Django-doc]
is the name of the manager to fetch the related objects in reverse. Therefore
normally the related_name of a ForeignKey or ManyToManyField is plural,
for example status_changes instead of StatusHistory.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading