Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

In R, how do I make a long dataset that contains all the categories that a row applies to?

I have a dataset of students, and I want to create a column that categorizes each student based on their and other students’ data in their "subject" and "grade_letter" column because I eventually want to create a column that says "math_science_A", "science_B", etc. to know how many students got As, Bs, Cs, etc. based on the unique make ups.

I have a list of all potential combinations that a student can have, but when I try to iterate over R, my case_when only gives me the first category. I then tried to make multiple columns, so I could pivot longer later, but that also seemed to fail. Any help would be appreciated!

#Here's my sample dataset
library(tidyverse)
test <- tibble(student_id = c(1),
               subject = c("math"),
               grade_num = c(95),
               grade = c("a"))



#Here's a sample of what I want (full desired output would be too long to write; see text below code chunks)
sample_of_desired_output <- tibble(student_id = c(1),
               subject = c("math"),
               grade_num = c(95),
               grade = c("a"),
category = c("n_math_a", "n_science_a", "n_english_a", "n_math_science_a", "n_math_english_a"))

#####Here's what I tried
#Getting all combinations--this part works
all_subjects <- c("math", "science", "english")

i = 1
while(i <= length(all_subjects)) {
  
  if(i == 1) {
    all_subject_combos <- c()
  } else {
    some_combos <- combn(all_subjects, i,  FUN = function(x) paste(x, collapse = "_"), simplify = TRUE)
    all_subject_combos <- append(all_subject_combos, some_combos)
  }
  
  i <- i +1
}

all_grades <- c("A", "B", "C", "D", "F")

all_subjects_and_grades <- expand.grid(all_subject_combos, all_grades)

subjects_and_grades_combos <- all_subjects_and_grades %>%
  unite("names", c(Var1, Var2)) %>%
  mutate(names =  tolower(paste0("n_", str_replace_all(names, "\\|", "_")))) %>%
  pull(names)


#This part does not work but is what I tried:

i = 1
while(i <= length(subjects_and_grades_combos)) {
  does_not_work <- test %>%
    mutate("subject_grade_grouping_{{i}}" := case_when(
      str_detect(subjects_and_grades_combos[i], subject) ~ subjects_and_grades_combos[i]))
  i <- i +1
}
does_not_work

Note that in the final, there will be all potential combinations, so each student will have their data repeating many times. After I have this, I will be using this to calculate the number of students who fall into each category. So, in this example, math_a, math_science_a, math_english_a, and math_science_english_a would all have 1. Note that I do not need help calculating that part, and I am flexible with the order of the columns.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

The while loops needs to update the original object or else it gets the last iteration update

library(dplyr)
library(tidyr)
test1 <- test
 i = 1
while(i <= length(subjects_and_grades_combos)) {
  test1 <- test1 %>%
     mutate("subject_grade_grouping_{{i}}" := case_when(
       str_detect(subjects_and_grades_combos[i], subject) ~ subjects_and_grades_combos[i]))
   i <- i +1
 }

test1 %>% 
  pivot_longer(cols = starts_with('subject_grade'),
   names_to = NULL, values_to = "category", values_drop_na = TRUE)

-output

# A tibble: 15 × 5
   student_id subject grade_num grade category                
        <dbl> <chr>       <dbl> <chr> <chr>                   
 1          1 math           95 a     n_math_science_a        
 2          1 math           95 a     n_math_english_a        
 3          1 math           95 a     n_math_science_english_a
 4          1 math           95 a     n_math_science_b        
 5          1 math           95 a     n_math_english_b        
 6          1 math           95 a     n_math_science_english_b
 7          1 math           95 a     n_math_science_c        
 8          1 math           95 a     n_math_english_c        
 9          1 math           95 a     n_math_science_english_c
10          1 math           95 a     n_math_science_d        
11          1 math           95 a     n_math_english_d        
12          1 math           95 a     n_math_science_english_d
13          1 math           95 a     n_math_science_f        
14          1 math           95 a     n_math_english_f        
15          1 math           95 a     n_math_science_english_f
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading