Remove NA from a dataframe column R

I have a dataframe named Resultaat

Cluster Number
W63     1020 NA NA NA 1100
W50     1020 NA 1240 NA NA

I want to remove all the NA values en keep the numbers. The columns are defined as character.

Expected output

Cluster Number
W63     1020 1100
W50     1020 1240 

I tried things like
gsub("^NA(?:\\s+NA)*\\b\\s*|\\s*\\bNA(?:\\s+NA)*$", "", Resultaat$Number)
& Resultaat <- Resultaat[!is.na(Resultaat)] but nothing works

>Solution :

Here is one option – read the column ‘Number’ with read.table and unite all the columns, excluding the NA elements with na.rm = TRUE

library(tidyr)
library(dplyr)
read.table(text = Resultaat$Number, header = FALSE, fill = TRUE) %>% 
  unite(Number, everything(), na.rm = TRUE, sep = " ") %>% 
  bind_cols(Resultaat[1], .)

-output

Cluster    Number
1     W63 1020 1100
2     W50 1020 1240

Regarding the gsub, it can be

gsub("\\s+NA|NA\\s+|NA$|^NA", "", Resultaat$Number)
[1] "1020 1100" "1020 1240"

Or may also use tidvyerse methods as

library(dplyr)
library(tidyr)
library(stringr)
Resultaat %>%
   separate_rows(Number) %>% 
   na_if("NA") %>%
   drop_na() %>%
   group_by(Cluster) %>%
   summarise(Number = str_c(Number, collapse = " "))

-output

# A tibble: 2 × 2
  Cluster Number   
  <chr>   <chr>    
1 W50     1020 1240
2 W63     1020 1100

data

Resultaat <- structure(list(Cluster = c("W63", "W50"), 
Number = c("1020 NA NA NA 1100", 
"1020 NA 1240 NA NA")), class = "data.frame", row.names = c(NA, 
-2L))

Leave a Reply