Why does mutate + across in dplyr create columns with "[,1]" at the end?

See code below.

The mutate(across(everything(), scale, .names = "{.col}_z")) part of the code generates columns with [,1] appended at the end of their names.

Two questions:

  1. Why is this happening?
  2. How can I avoid or remove it?
library(dplyr)

# Input
df_test <- tibble(x = c(1, 2, 3, 4), y = c(5, 6, 7, 8))

# My code generating x_z and y_z
df_scaled <- df_test %>% 
  mutate(across(everything(), scale, .names = "{.col}_z"))

# Output
df_scaled
#> # A tibble: 4 × 4
#>       x     y x_z[,1] y_z[,1]
#>   <dbl> <dbl>   <dbl>   <dbl>
#> 1     1     5  -1.16   -1.16 
#> 2     2     6  -0.387  -0.387
#> 3     3     7   0.387   0.387
#> 4     4     8   1.16    1.16

Expected output

#> # A tibble: 4 × 4
#>       x     y     x_z     y_z
#>   <dbl> <dbl>   <dbl>   <dbl>
#> 1     1     5  -1.16   -1.16 
#> 2     2     6  -0.387  -0.387
#> 3     3     7   0.387   0.387
#> 4     4     8   1.16    1.16

Created on 2022-12-30 with reprex v2.0.2

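Solution:

scale returns a one-column matrix, not a vector, and a tibble prints a matrix column with [,1] after its name. To get a plain numeric vector, wrap the result in c, extract the column with [, or use as.numeric; each of these drops the dim attribute.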

library(dplyr)
df_test %>% 
  mutate(across(everything(),
     ~ as.numeric(scale(.x)), .names = "{.col}_z"))

Output:

# A tibble: 4 × 4
      x     y    x_z    y_z
  <dbl> <dbl>  <dbl>  <dbl>
1     1     5 -1.16  -1.16 
2     2     6 -0.387 -0.387
3     3     7  0.387  0.387
4     4     8  1.16   1.16 
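
For comparison, the other two options mentioned in this answer (c and [) could be written roughly as follows. This is a minimal sketch that reuses df_test from the question; the names df_c and df_idx are only illustrative.

```r
library(dplyr)

df_test <- tibble(x = c(1, 2, 3, 4), y = c(5, 6, 7, 8))

# Option 1: c() drops the dim attribute, leaving a plain numeric vector
df_c <- df_test %>%
  mutate(across(everything(), ~ c(scale(.x)), .names = "{.col}_z"))

# Option 2: extract the single column of the matrix with [
df_idx <- df_test %>%
  mutate(across(everything(), ~ scale(.x)[, 1], .names = "{.col}_z"))

# Check whether the new columns are still matrices
is.matrix(df_c$x_z)
is.matrix(df_idx$x_z)
```

Which form to use is mostly a matter of taste; as.numeric and c are probably the most readable when every scaled column is known to be a single column.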

To see that scale returns a matrix, check its output on a single column:

> scale(df_test[[1]])
           [,1]
[1,] -1.1618950
[2,] -0.3872983
[3,]  0.3872983
[4,]  1.1618950
attr(,"scaled:center")
[1] 2.5
attr(,"scaled:scale")
[1] 1.290994

If we check the source code, the first step is a conversion to a matrix:

> scale.default
function (x, center = TRUE, scale = TRUE) 
{
    x <- as.matrix(x) # the input is converted to a matrix here
...

This conversion is needed because scale.default goes on to use apply, colMeans and sweep, which work on matrices. So when we pass a vector to scale, it is converted to a single-column matrix:

> as.matrix(df_test$x)
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
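
To confirm this on the data frame from the question, and to flatten the matrix columns after the fact, something like the following could be used. It is a sketch under the assumption that the scaled columns all end in "_z", as in the question; df_fixed is a hypothetical name.

```r
library(dplyr)

df_test <- tibble(x = c(1, 2, 3, 4), y = c(5, 6, 7, 8))
df_scaled <- df_test %>%
  mutate(across(everything(), scale, .names = "{.col}_z"))

names(df_scaled)          # the stored column names have no "[,1]" suffix
is.matrix(df_scaled$x_z)  # the column itself is a one-column matrix
dim(df_scaled$x_z)

# Convert the existing matrix columns to plain numeric vectors
df_fixed <- df_scaled %>%
  mutate(across(ends_with("_z"), as.numeric))
is.matrix(df_fixed$x_z)
```

So the "[,1]" is added only when the tibble is printed; the column names themselves stay x_z and y_z.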