R: Finding Out Which Students are Improving Their Grade

February 6, 2023

I am working with the R programming language.

Suppose I have the following dataset of student grades:

my_data <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  year  = c(2010, 2011, 2012, 2013, 2014, 2008, 2009, 2010, 2018, 2019, 2020, 2021),
  grade = c(55, 56, 61, 61, 62, 90, 89, 89, 67, 87, 51, 65)
)

> my_data
   id year grade
1   1 2010    55
2   1 2011    56
3   1 2012    61
4   1 2013    61
5   1 2014    62
6   2 2008    90
7   2 2009    89
8   2 2010    89
9   3 2018    67
10  3 2019    87
11  3 2020    51
12  3 2021    65

My Question: I want to find out which students improved their grades (or kept the same grade) from year to year, and which students got worse grades from year to year.

Using the idea of "grouped window functions", I tried to write the following functions:

check_grades_improvement <- function(grades){
  for(i in 2:length(grades)){
    if(grades[i] < grades[i-1]){
      return(FALSE)
    }
  }
  return(TRUE)
}

check_grades_decline <- function(grades){
  for(i in 2:length(grades)){
    if(grades[i] > grades[i-1]){
      return(FALSE)
    }
  }
  return(TRUE)
}

Then, I tried to apply these functions to my dataset:

improving_students <- my_data %>% 
  group_by(id) %>% 
  filter(check_grades_improvement(grade)) %>% 
  select(id) %>% 
  unique()


worse_students <- my_data %>% 
  group_by(id) %>% 
  filter(check_grades_decline(grade)) %>% 
  select(id) %>% 
  unique()

But I am getting empty results.

Can someone please show me what I am doing wrong and how I can fix this?

Thanks!

>Solution :

Each of your functions returns a single TRUE or FALSE for a whole student, so one way to use them is to collapse every group to one row holding that value and then filter on it. The summarize function from the dplyr package does this: it calls check_grades_improvement and check_grades_decline on the grade column once per id. The pipeline also needs library(dplyr) to be loaded, which the code in the question does not show:

library(dplyr)

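# TRUE if the grade never drops from one year to the next (ties allowed).
check_grades_improvement <- function(grades){
  # seq_along(grades)[-1] is empty for a single grade, unlike 2:length(grades),
  # which would count down (2, 1) and compare against a missing value.
  for(i in seq_along(grades)[-1]){
    if(grades[i] < grades[i-1]){
      return(FALSE)
    }
  }
  return(TRUE)
}

# TRUE if the grade never rises from one year to the next (ties allowed).
check_grades_decline <- function(grades){
  for(i in seq_along(grades)[-1]){
    if(grades[i] > grades[i-1]){
      return(FALSE)
    }
  }
  return(TRUE)
}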

# summarize() leaves one row per id, so no unique() is needed afterwards.
improving_students <- my_data %>% 
  group_by(id) %>% 
  summarize(improvement = check_grades_improvement(grade)) %>% 
  filter(improvement) %>% 
  select(id)

worse_students <- my_data %>% 
  group_by(id) %>% 
  summarize(decline = check_grades_decline(grade)) %>% 
  filter(decline) %>% 
  select(id)

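With the sample data, improving_students should contain id 1 (the grades never drop) and worse_students should contain id 2 (the grades never rise). Student 3 goes both up and down, so that id should appear in neither result.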
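
If you prefer to avoid the explicit loops, the same check can probably be written with base R's diff() and all(). The following is only a sketch under a few assumptions: the column names never_dropped and never_rose and the objects trend, improving_ids and worse_ids are made up for illustration, and the arrange() step assumes grades should be compared in year order, which the sample data already satisfies.

```r
library(dplyr)

my_data <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  year  = c(2010, 2011, 2012, 2013, 2014, 2008, 2009, 2010, 2018, 2019, 2020, 2021),
  grade = c(55, 56, 61, 61, 62, 90, 89, 89, 67, 87, 51, 65)
)

# One row per student. diff(grade) holds the year-to-year changes,
# so all(diff(grade) >= 0) means the grade never dropped.
trend <- my_data %>%
  arrange(id, year) %>%   # assumption: compare grades in year order
  group_by(id) %>%
  summarize(
    never_dropped = all(diff(grade) >= 0),  # hypothetical column name
    never_rose    = all(diff(grade) <= 0)   # hypothetical column name
  )

# Pull out the ids that satisfy each condition.
improving_ids <- trend$id[trend$never_dropped]
worse_ids     <- trend$id[trend$never_rose]
```

Note that a student with a single record, or with the same grade every year, would satisfy both conditions here, because all() of an empty or all-zero set of changes is TRUE. Whether such students should count as "improving", "worse", or neither is a decision you would need to make for your own data.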