R dplyr: replace strings containing only spaces with NA, for all columns

The key issue is that I need it to be dynamic, as I may not know which columns are empty (especially if new ones are added).

Here is my current attempt, based on code from here:

singleAction <- tibble::tibble(
  datePublished = as.Date("2022/01/01"),
  title = "",
  summary = "   ",
  createdby = "bob",
  customer = "james",
  contactAnalyst = "jack",
  type = "fighting",
  status = "draft",
  draftLink = "            ",
  finalLink = " ",
  invalidated = FALSE,
  number = 4
  
) 
# idea 1, in two steps >
singleAction  <- singleAction %>%
      mutate_if(is.character, map(stringr::str_remove_all(., " "))) %>%
      mutate_if(is.character, mutate(across(.fns = ~replace(., . ==  "" , NA))))

# idea 2 
singleAction <- mutate_if(is.character, replace(., grepl("^\\s*$", .) == TRUE, NA))


print(singleAction)

This code works for a single value, but not for a table or column:

replaceEmpty <- function(x){
  if(nchar(x) == 0 || stringr::str_remove_all(x, " ") == ""){
    return(NA)
  }
  return(x)
}

>Solution :

str_squish() from the stringr package removes whitespace at the start and end of a string and collapses repeated internal whitespace into a single space. A string that is empty or contains only whitespace therefore squishes to "", which makes it easy to test for and convert to NA:

library(dplyr)
library(stringr)

singleAction |>
  mutate(across(where(is.character), ~ ifelse(str_squish(.) == "", NA, .)))

Output

  datePublished title summary createdby customer contactAnalyst type     status
  <date>        <lgl> <lgl>   <chr>     <chr>    <chr>          <chr>    <chr> 
1 2022-01-01    NA    NA      bob       james    jack           fighting draft 
# ℹ 4 more variables: draftLink <lgl>, finalLink <lgl>, invalidated <lgl>,
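
One thing visible in the output above is that the columns containing only blanks come back as <lgl> rather than <chr>, because ifelse() returns a plain logical NA when every value is replaced. If keeping the character type matters, a possible variant is sketched below. The helper name blank_to_na and the example tibble df are made up for illustration and are not part of the original answer; the sketch assumes a recent dplyr and stringr.

```r
library(dplyr)
library(stringr)

# Hypothetical helper (not from the original answer): turn empty or
# whitespace-only strings into NA while keeping the column as character.
blank_to_na <- function(x) {
  # "^\\s*$" matches strings made up of zero or more whitespace characters
  if_else(str_detect(x, "^\\s*$"), NA_character_, x)
}

# Small example tibble, made up for illustration.
df <- tibble(
  title   = c("", "report"),
  summary = c("   ", "   "),
  number  = c(4, 5)
)

# Apply the helper to every character column, whatever its name.
cleaned <- df |>
  mutate(across(where(is.character), blank_to_na))

print(cleaned)
```

Because the test uses a regular expression instead of squishing the values themselves, non-blank strings are left exactly as they were, and where(is.character) picks up any character columns added later.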
