R: Calculating Percentiles For Multiple Groups

I am working with the R programming language.

I have the following dataset:

set.seed(123)

library(dplyr)
var1 = rnorm(10000, 100,100)
var2 = rnorm(10000, 100,100)
var3 = rnorm(10000, 100,100)
var4 = rnorm(10000, 100,100)
id = 1:10000

final = data.frame(id, var1, var2, var3, var4)

final = final %>%
  mutate(class1 = case_when(var1 < mean(var1) ~ "A",
                            TRUE ~ "B")) %>%
  mutate(class2 = case_when(var2 < mean(var2) ~ "C",
                            TRUE ~ "D"))

I want to calculate deciles for var3 and var4 based on every unique combination of class1 and class2.

As I understand, this means:

  • For all rows WHERE class1 = A AND class2 = C, calculate/assign deciles for var3 and var4
  • For all rows WHERE class1 = A AND class2 = D, calculate/assign deciles for var3 and var4
  • For all rows WHERE class1 = B AND class2 = C, calculate/assign deciles for var3 and var4
  • For all rows WHERE class1 = B AND class2 = D, calculate/assign deciles for var3 and var4
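
The four cases above are exactly the groups that group_by(class1, class2) produces. As a minimal sketch, the combinations and their sizes can be listed with count(). The data frame name demo is hypothetical; it is built the same way as final but keeps only the columns needed for this check.

```r
library(dplyr)

set.seed(123)

# Hypothetical stand-in for `final`, holding only the columns needed here
demo <- data.frame(
  id   = 1:10000,
  var1 = rnorm(10000, 100, 100),
  var2 = rnorm(10000, 100, 100)
) %>%
  mutate(class1 = if_else(var1 < mean(var1), "A", "B"),
         class2 = if_else(var2 < mean(var2), "C", "D"))

# One row per unique (class1, class2) combination, with its row count
group_sizes <- demo %>% count(class1, class2)
print(group_sizes)
```

Each row of group_sizes is one group within which the deciles are computed separately.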

Here is the R code I wrote for this:

final = final %>%
  group_by(class1, class2) %>%
  mutate(class3 = case_when(ntile(var3, 10) == 1 ~ "one",
                             ntile(var3, 10) == 2 ~ "two",
                             ntile(var3, 10) == 3 ~ "three",
                             ntile(var3, 10) == 4 ~ "four",
                             ntile(var3, 10) == 5 ~ "five",
                             ntile(var3, 10) == 6 ~ "six",
                             ntile(var3, 10) == 7 ~ "seven",
                             ntile(var3, 10) == 8 ~ "eight",
                             ntile(var3, 10) == 9 ~ "nine",
                             ntile(var3, 10) == 10 ~ "ten")) %>%
  mutate(class4 = case_when(ntile(var4, 10) == 1 ~ "one",
                             ntile(var4, 10) == 2 ~ "two",
                             ntile(var4, 10) == 3 ~ "three",
                             ntile(var4, 10) == 4 ~ "four",
                             ntile(var4, 10) == 5 ~ "five",
                             ntile(var4, 10) == 6 ~ "six",
                             ntile(var4, 10) == 7 ~ "seven",
                             ntile(var4, 10) == 8 ~ "eight",
                             ntile(var4, 10) == 9 ~ "nine",
                             ntile(var4, 10) == 10 ~ "ten"))

Can someone please tell me if I have done this correctly?

Thanks!

Solution:

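Instead of writing out every case in case_when(), you can convert the decile numbers from ntile() to words with the english package and apply the same step to both columns with across():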

library(dplyr)
library(stringr)

final %>%
  group_by(class1, class2) %>%
  # ntile() ranks within each group; english() turns 1..10 into "one".."ten"
  mutate(across(var3:var4,
                ~ as.character(english::english(ntile(.x, 10))),
                .names = "{str_replace(.col, 'var', 'class')}")) %>%
  ungroup()

Output:

# A tibble: 10,000 × 9
      id  var1  var2    var3  var4 class1 class2 class3 class4
   <int> <dbl> <dbl>   <dbl> <dbl> <chr>  <chr>  <chr>  <chr> 
 1     1  44.0 337.    16.4   80.6 A      D      three  five  
 2     2  77.0  83.3   77.9  126.  A      C      five   six   
 3     3 256.  193.  -110.    46.2 B      D      one    four  
 4     4 107.   43.2  -66.8  -17.9 B      C      one    two   
 5     5 113.  123.    -9.80 190.  B      D      two    nine  
 6     6 272.  213.   -66.6   98.4 B      D      one    six   
 7     7 146.  238.    95.0  118.  B      D      five   six   
 8     8 -26.5  76.7  256.   160.  A      C      ten    eight 
 9     9  31.3 -60.1   59.5  126.  A      C      four   six   
10    10  55.4  70.2  179.   130.  A      C      eight  seven 
# … with 9,990 more rows
# ℹ Use `print(n = ...)` to see more rows
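
If you would rather not depend on the english package, a plain lookup vector indexed by the decile number is one possible alternative. The sketch below is an assumption-laden illustration, not part of the answer above: demo and decile_labels are hypothetical names, and the data are a small simulated stand-in for final. It also includes a quick sanity check that the deciles were assigned per group, since ntile() should give buckets whose sizes differ by at most one within each group.

```r
library(dplyr)

set.seed(123)

# Hypothetical small stand-in for `final`
demo <- data.frame(
  class1 = sample(c("A", "B"), 1000, replace = TRUE),
  class2 = sample(c("C", "D"), 1000, replace = TRUE),
  var3   = rnorm(1000, 100, 100),
  var4   = rnorm(1000, 100, 100)
)

# Lookup vector: position i holds the label for decile i
decile_labels <- c("one", "two", "three", "four", "five",
                   "six", "seven", "eight", "nine", "ten")

labelled <- demo %>%
  group_by(class1, class2) %>%
  mutate(class3 = decile_labels[ntile(var3, 10)],
         class4 = decile_labels[ntile(var4, 10)]) %>%
  ungroup()

# Sanity check: smallest and largest decile size within each group
check <- labelled %>%
  count(class1, class2, class3) %>%
  group_by(class1, class2) %>%
  summarise(min_n = min(n), max_n = max(n), .groups = "drop")
print(check)
```

If max_n and min_n differ by more than one in any row of check, the deciles were probably computed over the whole data frame rather than within each class1/class2 group.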