R: Problem with specifying column using data.table v1.14.8

I have a problem specifying a column by name with data.table v1.14.8. Below are two examples: one that doesn’t work and one that does. The one that fails uses my original dataset, which is the one I need to get working.

What doesn’t work (original dataset)

  vars.row <- c("category", "subcategory", "variable")
  
  nrows <- length(vars.row)
  
  
  d <-
    structure(list(category = c("", "\\\\[-6pt]Scheduling", "\\\\[-6pt]Scheduling", 
                              "\\\\[-6pt]Scheduling", "\\\\[-6pt]Scheduling", "\\\\[-6pt]Statistics", 
                              "\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics", 
                              "\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics"), 
                   subcategory = c("", 
                               "New Procedures", "New Procedures", "New Partners", "New Partners", 
                               "", "", "", "", "", ""), 
                   variable = c("(Intercept)", "New procedures", 
                                 "(New procedures)$^2$", "New partners", "(New partners)$^2$", 
                                 "Adj. R$^2$", "AIC", "AICc", "BIC", "Deviance", "N")
                                 ), 
              row.names = c(NA, -11L), 
              class = c("data.table", "data.frame")
              )   # .internal.selfref = <pointer: 0x1410254e0>

  for(i in 1:nrows){
    var <- vars.row[i]
    for(j in nrow(d):2){
      if(d[j, ..var] == d[j-1, ..var]) d[j, ..var] <- ''
    }
  }
  
  

What works (fake dataset)

  vars.row <- c("category", "subcategory")
  
  nrows <- length(vars.row)
  
  d <-
    data.frame(
      category = c(rep('A', 5), rep('B', 5)),
      subcategory = c(rep('a', 3), rep('b', 3), rep('c', 4))
    )
  
  for(i in 1:nrows){
    var <- vars.row[i]
    for(j in nrow(d):2){
      if(d[j, ..var] == d[j-1, ..var]) d[j, ..var] <- ''
    }
  }
  

Solution:

I’m not sure what causes the error, but what I think you’re trying to do (blank out every value that repeats the value in the row above it) can be done with:

for (var in vars.row) d[get(var) == shift(get(var)), c(var) := ""]
d
#                 category    subcategory             variable
#                   <char>         <char>               <char>
#  1:                                              (Intercept)
#  2: \\\\[-6pt]Scheduling New Procedures       New procedures
#  3:                                     (New procedures)$^2$
#  4:                        New Partners         New partners
#  5:                                       (New partners)$^2$
#  6: \\\\[-6pt]Statistics                          Adj. R$^2$
#  7:                                                      AIC
#  8:                                                     AICc
#  9:                                                      BIC
# 10:                                                 Deviance
# 11:                                                        N
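
To make the mechanics of that one-liner easier to follow, here is a minimal sketch on a toy table. The table `toy` and its column names are invented for illustration and are not part of the question's data. It assumes only that data.table is attached: `shift()` lags a column by one row by default, and an `NA` in the `i` condition is treated as `FALSE`, so the first row is never blanked.

```r
library(data.table)

# Toy data, invented for illustration (not the asker's dataset)
toy <- data.table(grp = c("A", "A", "B", "B", "B", "C"))

# shift() lags by one row by default, so row k is compared with row k - 1
toy[, prev := shift(grp)]
toy[, is_repeat := grp == prev]  # NA in row 1, which has no previous row

# Same pattern as the answer: the column name is held in a string,
# get(var) looks the column up, and `:=` updates it by reference
var <- "grp"
toy[get(var) == shift(get(var)), (var) := ""]

print(toy)
```

The helper columns `prev` and `is_repeat` are only there to show the intermediate values; the last statement does not need them.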

Or, without the for loop:

d[, c(vars.row) := lapply(.SD, function(z) { z[z == shift(z)] <- ""; z }),
  .SDcols = vars.row]
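
As a rough sketch of the same idea with the anonymous function pulled out, the version below uses a hypothetical helper, `blank_repeats()`, which is not a data.table function. It wraps the comparison in `which()` so that the `NA` produced for the first row is dropped explicitly. The table `toy` and the vector `cols` are made up for the example.

```r
library(data.table)

# Hypothetical helper (not part of data.table): blank every value that
# repeats the value directly above it
blank_repeats <- function(z) {
  repeats <- which(z == shift(z))  # which() drops the NA from row 1
  z[repeats] <- ""
  z
}

# Toy data, invented for illustration
toy <- data.table(
  category    = c("A", "A", "A", "B", "B"),
  subcategory = c("a", "a", "b", "b", "c"),
  value       = 1:5
)
cols <- c("category", "subcategory")

# .SDcols restricts .SD to the listed columns; `:=` writes them back in place
toy[, (cols) := lapply(.SD, blank_repeats), .SDcols = cols]

print(toy)
```

Each column is processed independently here, so a subcategory that repeats across a category boundary is blanked as well. If that is not wanted, the comparison would need to take the parent column into account.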
