R – Select all rows that have one NA value at most?

I’m trying to impute my data and keep as many observations as I can. I want to select the observations that have at most one NA value from the PimaIndiansDiabetes2 data set in the mlbench package, loaded with data(PimaIndiansDiabetes2, package = "mlbench").

For example:

Var1 Var2 Var3
1      NA   NA
2      34   NA
3      NA   NA
4      NA   55
5      NA   NA
6      40   28

What I would like returned:

Var1 Var2 Var3
2      34   NA
4      NA   55
6      40   28

The code below returns the rows that contain NA values. I know I could use merge() to join the observations with exactly one NA value to the observations without any NA values, but I’m not sure how to extract the former.

na_rows <- df[!complete.cases(df), ]

Solution:

A base R solution:

df[rowSums(is.na(df)) <= 1, ]
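To see why this works, here is a minimal sketch that rebuilds the example data from the question (the names na_count and result are only illustrative). is.na(df) returns a logical matrix, and rowSums() counts the TRUE values in each row, so the comparison keeps the rows with zero or one missing value.

```r
# Toy data copied from the example in the question
df <- data.frame(
  Var1 = 1:6,
  Var2 = c(NA, 34, NA, NA, NA, 40),
  Var3 = c(NA, NA, NA, 55, NA, 28)
)

# is.na(df) is a logical matrix; rowSums() counts the NAs in each row
na_count <- rowSums(is.na(df))

# Keep the rows with at most one missing value
result <- df[na_count <= 1, ]
print(result)
```

On this data the filter keeps rows 2, 4 and 6, which matches the desired output. Changing the threshold (for example na_count == 1) would select only the rows with exactly one NA.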

Its dplyr equivalent (pick() requires dplyr 1.1.0 or later):

library(dplyr)

df %>%
  filter(rowSums(is.na(pick(everything()))) <= 1)
