I’m having trouble referring to a column whose name is stored in a variable, using data.table v1.14.8. Below are two examples: one that fails and one that works. The failing example uses my original dataset, which is the one I need to get working.
What doesn’t work (original dataset)
vars.row <- c("category", "subcategory", "variable")
nrows <- length(vars.row)
d <-
structure(list(category = c("", "\\\\[-6pt]Scheduling", "\\\\[-6pt]Scheduling",
"\\\\[-6pt]Scheduling", "\\\\[-6pt]Scheduling", "\\\\[-6pt]Statistics",
"\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics",
"\\\\[-6pt]Statistics", "\\\\[-6pt]Statistics"),
subcategory = c("",
"New Procedures", "New Procedures", "New Partners", "New Partners",
"", "", "", "", "", ""),
variable = c("(Intercept)", "New procedures",
"(New procedures)$^2$", "New partners", "(New partners)$^2$",
"Adj. R$^2$", "AIC", "AICc", "BIC", "Deviance", "N")
),
row.names = c(NA, -11L),
class = c("data.table", "data.frame")
) # .internal.selfref = <pointer: 0x1410254e0>
for(i in 1:nrows){
  var <- vars.row[i]
  for(j in nrow(d):2){
    if(d[j, ..var] == d[j-1, ..var]) d[j, ..var] <- ''
  }
}
What works (fake dataset)
vars.row <- c("category", "subcategory")
nrows <- length(vars.row)
d <-
data.frame(
category = c(rep('A', 5), rep('B', 5)),
subcategory = c(rep('a', 3), rep('b', 3), rep('c', 4))
)
for(i in 1:nrows){
  var <- vars.row[i]
  for(j in nrow(d):2){
    if(d[j, ..var] == d[j-1, ..var]) d[j, ..var] <- ''
  }
}
Solution:
I’m not sure what causes the bug, but I think this does what you’re trying to do:
for (var in vars.row) d[get(var) == shift(get(var)), c(var) := ""]
d
#                 category    subcategory             variable
#                   <char>         <char>               <char>
#  1:                                              (Intercept)
#  2: \\\\[-6pt]Scheduling New Procedures       New procedures
#  3:                                     (New procedures)$^2$
#  4:                        New Partners         New partners
#  5:                                       (New partners)$^2$
#  6: \\\\[-6pt]Statistics                           Adj. R$^2$
#  7:                                                      AIC
#  8:                                                     AICc
#  9:                                                      BIC
# 10:                                                 Deviance
# 11:                                                        N
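To see why comparing a column with `shift()` of itself picks out the right rows, here is a minimal sketch on a made-up one-column table (`toy`, `x`, `prev` and `is_repeat` are illustrative names of mine, not from the question). It assumes the default behaviour of `shift()`, which lags by one position and fills the first element with `NA`.

```r
library(data.table)

# Made-up data: two runs of repeated values
toy <- data.table(x = c("A", "A", "B", "B", "B"))

# shift() lags the vector by one row; row 1 has no predecessor, so it gets NA
prev <- shift(toy$x)

# TRUE wherever a row equals the row directly above it
is_repeat <- toy$x == prev
print(is_repeat)
# NA TRUE FALSE TRUE TRUE

# In the i argument of a data.table, NA is treated as FALSE,
# so row 1 is left alone and only rows 2, 4 and 5 are blanked
toy[x == shift(x), x := ""]
print(toy$x)
# "A" "" "B" "" ""
```

Because the comparison is evaluated on the whole column before anything is assigned, every repeat is compared against the original value above it, not against an already-blanked cell.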
or without the for loop:
d[, c(vars.row) := lapply(.SD, function(z) { z[z == shift(z)] <- ""; z; }),
.SDcols = vars.row]
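If you would rather keep the explicit row-by-row loop from the question, a rough sketch is below. It sidesteps the `..var` prefix, which as far as I know is only interpreted when reading columns with `[`, not in a `[<-` assignment (treat that as my assumption, I have not tracked the bug down). The data is made up, and `r` is just a loop variable I introduced so it doesn’t clash with the `j` argument of `set()`.

```r
library(data.table)

# Made-up data in the same shape as the question's fake dataset
d <- data.table(
  category    = c("A", "A", "A", "B", "B"),
  subcategory = c("a", "a", "b", "b", "b")
)
vars.row <- c("category", "subcategory")

for (var in vars.row) {
  # Walk from the bottom up so each row is compared with a still-unmodified row above it
  for (r in nrow(d):2) {
    # d[[var]] extracts the column as a plain vector, so no `..` prefix is needed
    if (d[[var]][r] == d[[var]][r - 1]) {
      # set() updates a single cell by reference
      set(d, i = r, j = var, value = "")
    }
  }
}

print(d)
```

On real data the vectorised versions above should be faster, since this one touches the table once per row.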