R – Select all rows that have one NA value at most?

I’m trying to impute my data while keeping as many observations as I can. I want to select the observations that have at most one NA value. The data is PimaIndiansDiabetes2 from the mlbench package, loaded with data(PimaIndiansDiabetes2, package = "mlbench").

For example:

Var1 Var2 Var3
   1   NA   NA
   2   34   NA
   3   NA   NA
   4   NA   55
   5   NA   NA
   6   40   28

What I would like returned:

Var1 Var2 Var3
   2   34   NA
   4   NA   55
   6   40   28

The code below returns every row that contains at least one NA value. I know that I could use merge() to join the observations with exactly one NA value to the observations without any NA values, but I’m not sure how to extract the rows with exactly one NA.

na_rows <- df[!complete.cases(df), ]

Solution:

A base R solution:

df[rowSums(is.na(df)) <= 1, ]
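
To see why this works, here is a minimal sketch that rebuilds the example data from the question and shows the intermediate step. The name na_per_row is introduced here only for illustration. is.na(df) gives a logical matrix, and rowSums() counts the TRUE values in each row.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Var1 = 1:6,
  Var2 = c(NA, 34, NA, NA, NA, 40),
  Var3 = c(NA, NA, NA, 55, NA, 28)
)

# is.na(df) is a logical matrix; rowSums() treats TRUE as 1,
# so this is the number of NA values in each row.
na_per_row <- rowSums(is.na(df))
print(na_per_row)

# Keep only the rows with zero or one NA value.
result <- df[na_per_row <= 1, ]
print(result)
```

Rows 2, 4 and 6 are kept, which matches the desired output in the question.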

Its dplyr equivalent (pick() requires dplyr 1.1.0 or later):

library(dplyr)

df %>%
  filter(rowSums(is.na(pick(everything()))) <= 1)
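
If the NA threshold might change during imputation experiments, the base R filter can be wrapped in a small helper. This is only a sketch: the function name keep_rows_with_max_na and its max_na argument are hypothetical and do not come from any package.

```r
# Hypothetical helper (not part of base R or dplyr): keep the rows of
# `data` that contain at most `max_na` missing values.
keep_rows_with_max_na <- function(data, max_na = 1) {
  # drop = FALSE keeps the result a data frame even with a single column.
  data[rowSums(is.na(data)) <= max_na, , drop = FALSE]
}

# Example data from the question.
df <- data.frame(
  Var1 = 1:6,
  Var2 = c(NA, 34, NA, NA, NA, 40),
  Var3 = c(NA, NA, NA, 55, NA, 28)
)

at_most_one <- keep_rows_with_max_na(df)              # rows with 0 or 1 NA
complete    <- keep_rows_with_max_na(df, max_na = 0)  # complete rows only
print(at_most_one)
print(complete)
```

With max_na = 0 the helper selects the same rows as df[complete.cases(df), ], so the complete rows and the rows with one NA come out of a single filter and no merge() step is needed.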