Get name of column inside .SD call in data.table

I am trying to access the names of the columns in .SD from inside a function, but I can’t manage to get them. In the toy example below, I need to append the suffix " by {z}" to every cell in the table that contains an "a". The {z} part stands for the name of the column the cell belongs to, and I need to do this for all columns. See the input table and the desired output table below.

library(data.table)
# Input 
ip <- data.table(x = c("ab", "cd", "ac", "de"),
                 y = c("fr", "ad", "fa", "we"))

ip[]
#>     x  y
#> 1: ab fr
#> 2: cd ad
#> 3: ac fa
#> 4: de we

# Desired Output table

op <- data.table(x = c("ab by x", "cd", "ac by x", "de"),
                 y = c("fr", "ad by y", "fa by y", "we"))
op[]
#>          x       y
#> 1: ab by x      fr
#> 2:      cd ad by y
#> 3: ac by x fa by y
#> 4:      de      we

One way that I thought could work is to use deparse(substitute(x)) as in the example below.

add_if_pattern <- function(x, pattern) {
  y <- deparse(substitute(x))
  fifelse(test = grepl(pattern, x),
          paste(x, "by",  y),
          x)
}

pattern <- "a"
z <- "blah"
q <- "bleh"
add_if_pattern(z, pattern) ## add the pattern
#> [1] "blah by z"
add_if_pattern(q, pattern) ## does not add the pattern
#> [1] "bleh"

However, when I use that function inside lapply(.SD, ...) in data.table, it does something unexpected.

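tp <- copy(ip)  # keep an untouched copy, because := modifies ip by reference
ip <- copy(tp)  # start each attempt from the untouched copy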

vars <- names(ip)
ip[, (vars) := lapply(.SD,add_if_pattern, pattern)]
ip[]
#>               x            y
#> 1: ab by X[[i]]           fr
#> 2:           cd ad by X[[i]]
#> 3: ac by X[[i]] fa by X[[i]]
#> 4:           de           we

I don’t want X[[i]], but the names of the original columns, either x or y. I also tried using names(.SD), but it seems to be out of scope inside the anonymous function, so I got an error (see below). Could you please give me a hand?

Thanks.

ip <- copy(tp)
ip[, (vars) := lapply(.SD,
                      \(x){
                        fifelse(test = grepl("classified", x),
                                paste(x, "by",  names(.SD)[..x]),
                                x)
                      })]
#> Error in `[.data.table`(ip, , `:=`((vars), lapply(.SD, function(x) {: Variable 'x' is not found in calling scope. Looking in calling scope because this symbol was prefixed with .. in the j= parameter.

Created on 2022-08-30 with reprex v2.0.2

Solution:

Consider passing the column name to the function as an explicit argument, and then use Map to iterate over the columns and their names in parallel.

add_if_pattern <- function(x, pattern, colnm) {
  # colnm is the name of the column that x came from, passed in explicitly
  fifelse(test = grepl(pattern, x),
          paste(x, "by", colnm),
          x)
}
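
The name has to be passed in explicitly because lapply() calls its function as FUN(X[[i]], ...), so deparse(substitute(x)) captures the literal expression X[[i]] rather than a column name. Here is a minimal base R sketch of that behaviour. The names show_arg and cols are hypothetical and used only for illustration.

```r
# Hypothetical helper: returns the expression the caller supplied for x, as text
show_arg <- function(x) deparse(substitute(x))

# Called directly, the supplied expression is the variable name
z <- "blah"
direct <- show_arg(z)
direct
#> [1] "z"

# Called through lapply(), the supplied expression is lapply's internal X[[i]]
cols <- list(x = c("ab", "cd"), y = c("fr", "ad"))
via_lapply <- unlist(lapply(cols, show_arg), use.names = FALSE)
via_lapply
#> [1] "X[[i]]" "X[[i]]"

# Map() walks the values and their names in parallel, so the name is available
via_map <- unlist(Map(function(x, nm) nm, cols, names(cols)), use.names = FALSE)
via_map
#> [1] "x" "y"
```

The same thing happens with .SD, which lapply() treats as an ordinary list of columns.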

-testing

ip[, (vars) := Map(function(x, nm) add_if_pattern(x, pattern, nm),
                   .SD, names(.SD)),
   .SDcols = vars]

-output

> ip
         x       y
    <char>  <char>
1: ab by x      fr
2:      cd ad by y
3: ac by x fa by y
4:      de      we
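
As a possible alternative, the same result can be obtained by looping over the column names and updating each column by reference with data.table's set(). This is only a sketch that assumes the three-argument add_if_pattern from this answer and the toy table from the question. It is not part of the original answer.

```r
library(data.table)

# Same three-argument function as in the answer: the column name is passed in
add_if_pattern <- function(x, pattern, colnm) {
  fifelse(grepl(pattern, x), paste(x, "by", colnm), x)
}

# Toy input from the question
ip <- data.table(x = c("ab", "cd", "ac", "de"),
                 y = c("fr", "ad", "fa", "we"))
pattern <- "a"
vars <- names(ip)

# Loop over the column names; set() replaces each column in place
for (v in vars) {
  set(ip, j = v, value = add_if_pattern(ip[[v]], pattern, v))
}
ip[]
```

Because the loop variable v is the column name itself, no lookup through .SD is needed.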