Clearing all values in a dataset while retaining attributes using tidyverse


I need to create a blank version of a dataset: clear all the values while preserving the column names and, importantly, the classes of the variables.

Here’s some toy data: three variables, each of a different class.

library(dplyr)

df <- data.frame(x = rnorm(5),
                 y = factor(letters[5:1]),
                 z = c(1:2, NA, 4:5))

glimpse(df)

Rows: 5
Columns: 3
$ x <dbl> -0.24530142, -0.05332072, 0.12387791, -0.26148671, -0.53779766
$ y <fct> e, d, c, b, a
$ z <int> 1, 2, NA, 4, 5

Now I try to clear the values using mutate() and across() from dplyr:

df %>%
  mutate(across(everything(),
                ~ NA)) -> blankDF

blankDF

   x  y  z
1 NA NA NA
2 NA NA NA
3 NA NA NA
4 NA NA NA
5 NA NA NA

Looks good, but

glimpse(blankDF)

# Rows: 5
# Columns: 3
# $ x <lgl> NA, NA, NA, NA, NA
# $ y <lgl> NA, NA, NA, NA, NA
# $ z <lgl> NA, NA, NA, NA, NA

This has changed the class of every variable to logical, because a bare NA is a logical constant. The factor levels of y are lost as well.
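To illustrate why the columns end up logical, here is a small base R sketch. The names na_classes, x, and blank_x are made up for this illustration and are not part of the original question.

```r
# A bare NA is a logical constant; R also provides typed NA constants.
na_classes <- c(
  bare      = class(NA),
  real      = class(NA_real_),
  integer   = class(NA_integer_),
  character = class(NA_character_)
)
print(na_classes)

# Filling with a typed NA keeps the type of the original vector.
x <- c(0.5, 1.5, 2.5)                # hypothetical double vector
blank_x <- rep(NA_real_, length(x))  # all NA, but still double
print(typeof(blank_x))
```

There is no typed NA constant for factors, so picking one constant per column does not generalise well to a mixed data frame.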

Can someone give me advice on how to get the blank dataset while retaining the attributes?

A tidyverse solution would be nice, but any solution is appreciated.

Solution:

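You could set every value to NA while keeping each column's type by using na_if() inside across(). Comparing a column with itself, na_if(.x, .x), matches every element, so all of them are replaced with NA, and na_if() returns a vector of the same type as its input: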

library(dplyr)

glimpse(df)
#> Rows: 5
#> Columns: 3
#> $ x <dbl> -0.2006935, 1.3461746, -0.1433400, -0.8983886, -0.3190282
#> $ y <fct> e, d, c, b, a
#> $ z <int> 1, 2, NA, 4, 5

df_output <- df %>%
  mutate(across(everything(), ~ na_if(.x, .x)))

glimpse(df_output)
#> Rows: 5
#> Columns: 3
#> $ x <dbl> NA, NA, NA, NA, NA
#> $ y <fct> NA, NA, NA, NA, NA
#> $ z <int> NA, NA, NA, NA, NA

Created on 2023-07-07 with reprex v2.0.2
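As a possible alternative that does not depend on dplyr, the following is a minimal base R sketch. It assumes that assigning NA into an existing vector with `col[] <- NA` keeps that vector's type and attributes, which holds for the double, factor, and integer columns in the toy data. The names blank and col are made up for this illustration.

```r
# Toy data from the question.
df <- data.frame(x = rnorm(5),
                 y = factor(letters[5:1]),
                 z = c(1:2, NA, 4:5))

# Start from a copy so that column names and row count are kept.
blank <- df

# Overwrite the contents of each column in place. Assigning into
# col[] coerces NA to the column's own type instead of replacing
# the column with a logical vector.
blank[] <- lapply(df, function(col) {
  col[] <- NA
  col
})

str(blank)
```

The factor column keeps its levels here, which can be checked with levels(blank$y).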
