Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Filter Data frame if at least 3 columns agrees the condition

I have a data frame as this

df <- data.frame(student_name = c('U','V','X','Y','Z'),
             grade = c('AA','CC','DD','AB','BB'),
             math_marks = c(40,80,38,97,65),
             eng_marks = c(95,78,36,41,25),
             sci_marks = c(56,25,36,87,15),
             Point_A=c(1,1,1,1,NA),
             Point_B=c(NA,1,NA,1,1),
             Point_C=c(NA,1,NA,NA,NA),
             Point_D=c(NA,NA,NA,NA,1),
             Point_E=c(NA,1,NA,NA,1))

I need add a column called "Point" based on the column values Point_A to Point_E, if any 3 column value equals 1.

Excepted output.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df <- data.frame(student_name = c('U','V','X','Y','Z'),
             grade = c('AA','CC','DD','AB','BB'),
             math_marks = c(40,80,38,97,65),
             eng_marks = c(95,78,36,41,25),
             sci_marks = c(56,25,36,87,15),
             Point_A=c(1,1,1,1,NA),
             Point_B=c(NA,1,NA,1,1),
             Point_C=c(NA,1,NA,NA,NA),
             Point_D=c(NA,NA,NA,NA,1),
             Point_E=c(NA,1,NA,NA,1),
             Point=c(NA,1,NA,NA,1))

So far I was doing with this for all possible 3 combinations

df%>% filter(Point_A ==1,Point_B==1,Point_C==1)

Is there any other way to do this ?

>Solution :

To subset down to the rows with that condition use rowSums with across to sum the 1’s by row:

df %>% filter(rowSums(across(starts_with("Point")), na.rm = TRUE) >= 3)
##   student_name grade math_marks eng_marks sci_marks Point_A Point_B Point_C Point_D Point_E
## 1            V    CC         80        78        25       1       1       1      NA       1
## 2            Z    BB         65        25        15      NA       1      NA       1       1

or to add a 0/1 Point column indicating whether that row satisfies the condition:

df %>% mutate(Point = + (rowSums(across(starts_with("Point")), na.rm = TRUE) >= 3))
##   student_name grade math_marks eng_marks sci_marks Point_A Point_B Point_C Point_D Point_E Point
## 1            U    AA         40        95        56       1      NA      NA      NA      NA     0
## 2            V    CC         80        78        25       1       1       1      NA       1     1
## 3            X    DD         38        36        36       1      NA      NA      NA      NA     0
## 4            Y    AB         97        41        87       1       1      NA      NA      NA     0
## 5            Z    BB         65        25        15      NA       1      NA       1       1     1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading