
group_by and case_when() function for multiple conditions

I'm struggling with a problem in R. I want to create a new variable (qc), grouping by NAME and PLOT and using case_when(): where "EH" > "PH", give me B, otherwise give me Q.

I have a data set like this:

  df <- tibble(
    NAMEOFEXPERIMENT= c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
    PLOT= c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
    trait= c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
    traitValue= c(125,36,36,240,"NA",36,36,90,110,35,33,215,190,36,31)
    )   

 # A tibble: 15 x 4
  NAME  PLOT trait traitValue
  <chr>            <dbl> <chr> <chr>     
  1 A                    2 EH    250       
  2 A                    1 NP    36        
  3 A                    2 NP    36        
  4 A                    1 PH    240       
  5 A                    2 PH    200        
  6 A                    1 PL    36        
  7 A                    2 PL    36        
  8 B                    1 EH    90        
  9 B                    2 EH    110       
 10 B                    1 NP    35        
 11 B                    2 NP    33        
 12 B                    1 PH    215       
 13 B                    2 PH    190       
 14 B                    1 PL    36        
 15 B                    2 PL    31  

This is what I want to achieve, within each NAME and PLOT group:
If "EH" > "PH", then give me B, else give me Q.
If "PL" > "NP", then give me B, else give me Q.

Thus, qc in row 4 should be empty, since there is no row with NAME "A", PLOT 1, and trait "EH" to compare with.

   # A tibble: 15 x 5
   NAME  PLOT trait traitValue qc
    <chr>            <dbl> <chr> <chr>     <chr>
  1 A                    2 EH    250       B
  2 A                    1 NP    36        Q
  3 A                    2 NP    36        Q
  4 A                    1 PH    240       
  5 A                    2 PH    200       B
  6 A                    1 PL    36        Q
  7 A                    2 PL    36        Q
  8 B                    1 EH    90        Q
  9 B                    2 EH    110       Q
 10 B                    1 NP    35        B
 11 B                    2 NP    33        Q
 12 B                    1 PH    215       Q
 13 B                    2 PH    190       Q
 14 B                    1 PL    36        B
 15 B                    2 PL    31        Q

When I run this code:

 dt2 <- df %>%
   group_by(NAME, PLOT) %>%
   mutate(data_qc = case_when(
           traitValue[trait == "EH"] > traitValue[trait == "PH"] ~ "B",
           traitValue[trait == "EH"] < traitValue[trait == "PH"] ~ "Q",
           traitValue[trait == "PL"] > traitValue[trait == "NP"] ~ "B",
           traitValue[trait == "PL"] < traitValue[trait == "NP"] ~ "Q"
           ))

I got this error:

 Error in `mutate()`:
 ! Problem while computing `data_qc = case_when(...)`.
  i The error occurred in group 1: NAME = "A", PLOT = 1.
 Caused by error in `case_when()`:
 ! `traitValue[trait == "EH"] > traitValue[trait == "PH"] ~ "B"`, `traitValue[trait == "EH"] < traitValue[trait == "PH"] ~ "Q"`
 must be length 3 or one, not 0.
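
The "not 0" part of the message can be reproduced in isolation. The sketch below uses hand-typed vectors as a stand-in for the group NAME = "A", PLOT = 1, which has three rows and no "EH" row; `first_or_na()` is a hypothetical helper introduced only for illustration, not a dplyr function.

```r
# Stand-in for one group (NAME "A", PLOT 1): three rows, no "EH" among them
trait      <- c("NP", "PH", "PL")
traitValue <- c(36, 240, 36)

eh <- traitValue[trait == "EH"]   # numeric(0): nothing matches
ph <- traitValue[trait == "PH"]   # 240

# Comparing a length-0 vector yields a length-0 result, not NA,
# which case_when() cannot recycle to the group's 3 rows
cmp <- eh > ph
length(cmp)                       # 0

# One possible guard (hypothetical helper): fall back to NA when a trait is absent
first_or_na <- function(x) if (length(x) == 0) NA_real_ else x[1]
first_or_na(eh) > ph              # NA
```

With a guard of this kind each comparison has length one, so it can be recycled across the rows of the group and yields NA where a trait is missing.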

>Solution :

I don't fully understand your constraints. You did not specify what should happen if "PH" > "EH" and "PL" > "NP" at the same time. In that case, should the final outcome be "B" or "Q"?

However, to get you started I wrote the following code:

## Loading the required libraries
library(dplyr)
library(tidyverse)

## Creating the dataframe
df <- data.frame(
  NAMEOFEXPERIMENT= c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
  PLOT= c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
  trait= c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
  traitValue= c(125,36,36,240,200,36,36,90,110,35,33,215,190,36,31)
)  

## Removing duplicates (unique() returns a copy, so assign the result)
df <- unique(df)

## Pivot from long to wide format, then flag each comparison
df %>%
  pivot_wider(names_from = trait, values_from = traitValue) %>%
  arrange(NAMEOFEXPERIMENT,PLOT) %>%
  mutate(ConditionalValue1 = ifelse(EH>PH,"B", "Q"),
         ConditionalValue2 = ifelse(PL>NP,"B", "Q"))

Output

# A tibble: 4 x 8
  NAMEOFEXPERIMENT  PLOT    EH    NP    PH    PL ConditionalValue1 ConditionalValue2
  <chr>            <dbl> <dbl> <dbl> <dbl> <dbl> <chr>             <chr>            
1 A                    1    NA    36   240    36 NA                Q                
2 A                    2   125    36   200    36 Q                 Q                
3 B                    1    90    35   215    36 Q                 B                
4 B                    2   110    33   190    31 Q                 Q                
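
If a per-row column in the original long layout is needed, the wide flags can be joined back onto the data. The following is only a sketch under an assumption the question leaves open: rows with trait "EH" or "PH" take the EH-versus-PH flag, and rows with trait "NP" or "PL" take the PL-versus-NP flag. The names `flags`, `EH_vs_PH`, `PL_vs_NP`, and `df_qc` are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  NAMEOFEXPERIMENT = c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
  PLOT = c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
  trait = c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
  traitValue = c(125,36,36,240,200,36,36,90,110,35,33,215,190,36,31)
)

# One row per NAME/PLOT group with both comparison flags;
# if_else() returns NA when a trait is missing from the group
flags <- df %>%
  pivot_wider(names_from = trait, values_from = traitValue) %>%
  mutate(
    EH_vs_PH = if_else(EH > PH, "B", "Q"),
    PL_vs_NP = if_else(PL > NP, "B", "Q")
  ) %>%
  select(NAMEOFEXPERIMENT, PLOT, EH_vs_PH, PL_vs_NP)

# Assumption: each row picks the flag of the comparison its trait belongs to
df_qc <- df %>%
  left_join(flags, by = c("NAMEOFEXPERIMENT", "PLOT")) %>%
  mutate(qc = if_else(trait %in% c("EH", "PH"), EH_vs_PH, PL_vs_NP)) %>%
  select(-EH_vs_PH, -PL_vs_NP)

df_qc
```

Under this assumption, row 4 (A, PLOT 1, PH) gets NA because that group has no "EH" value. A different rule for combining the two comparisons would only change the final `mutate()`.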
