In the toy example below, I want to delete all rows that contain Inf, -Inf, or NaN values. My actual data.table has many more columns.
Group<-c("A","B","C","D","E","F","G")
LRR <- c(Inf, 1,2,3,-Inf,4, 5)
LRR.var <- c(NaN, Inf, 3, -Inf, -Inf, 6,7)
data<-data.table(cbind(Group, LRR, LRR.var))
data
Group LRR LRR.var
A Inf NaN
B 1 Inf
C 2 3
D 3 -Inf
E -Inf -Inf
F 4 6
G 5 7
To delete all such rows in one go, I am using the following code, but I get an error.
Code –
data[!is.finite(data)]
Error –
Error: default method not implemented for type 'list'
Can someone suggest a method to delete all rows with any NaN or Inf values from data.table in one go?
I do not want to use code like the one below, because then I would have to name every column one by one to check for infinite values.
data[is.finite(data$LRR) & is.finite(data$LRR.var), ]
>Solution :
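The error itself comes from passing the whole data.table (a list) to is.finite, which only works on atomic vectors. In addition, all the columns here are of class character, so is.finite and is.infinite return FALSE for every element, as both expect numeric (or complex) input. According to ?is.infinite: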
is.infinite returns a vector of the same length as x the jth element of which is TRUE if x[j] is infinite (i.e., equal to one of Inf or -Inf) and FALSE otherwise. This will be false unless x is numeric or complex. Complex numbers are infinite if either the real or the imaginary part is.
> str(data)
Classes ‘data.table’ and 'data.frame': 7 obs. of 3 variables:
$ Group : chr "A" "B" "C" "D" ...
$ LRR : chr "Inf" "1" "2" "3" ...
$ LRR.var: chr "NaN" "Inf" "3" "-Inf" ...
> is.finite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> is.infinite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
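To see why both checks return FALSE, it can help to compare the same values stored as character and as numeric. This is a minimal base R sketch; the vector names `x_chr` and `x_num` are made up for illustration and are not part of the original answer.

```r
# Same values, once as character strings and once as numbers
x_chr <- c("Inf", "1", "2", "3", "-Inf", "4", "5")
x_num <- as.numeric(x_chr)

is.finite(x_chr)    # all FALSE: character input is never reported as finite
is.infinite(x_chr)  # all FALSE: character input is never reported as infinite
is.finite(x_num)    # FALSE TRUE TRUE TRUE FALSE TRUE TRUE
```

Once the values are numeric, is.finite flags Inf and -Inf (and also NaN and NA) as not finite, which is the behaviour the filter relies on.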
The columns need to be converted to numeric before the check is applied. As the data is a data.table, we can use data.table methods to subset:
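library(data.table)
# convert the character columns back to their natural types
data <- type.convert(data, as.is = TRUE)
# keep only the rows where every numeric column is finite
data[data[, Reduce(`&`,
    lapply(.SD, is.finite)), .SDcols = is.numeric]]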
-output
Group LRR LRR.var
1: C 2 3
2: F 4 6
3: G 5 7
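How the row filter is built can be sketched in base R on a plain data.frame: lapply() produces one logical vector per numeric column, and Reduce() combines them with `&`, so a row is kept only if every numeric value in it is finite. The names `df`, `num_cols`, and `keep` are illustrative assumptions, not part of the original answer.

```r
df <- data.frame(
  Group   = c("A", "B", "C", "D", "E", "F", "G"),
  LRR     = c(Inf, 1, 2, 3, -Inf, 4, 5),
  LRR.var = c(NaN, Inf, 3, -Inf, -Inf, 6, 7)
)

# Pick the numeric columns, mirroring .SDcols = is.numeric
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]

# One logical vector per column, then AND them together element-wise
keep <- Reduce(`&`, lapply(df[num_cols], is.finite))

df[keep, ]  # rows C, F and G remain
```

Because the mask is computed over whatever columns happen to be numeric, no column has to be named explicitly, which is the point of the question.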
Note: All the columns are character because cbind on plain vectors creates a matrix, and a matrix can hold only a single type. Since the column ‘Group’ is character, the numeric values are coerced to character as well. Instead, create the data.table or data.frame directly:
data <- data.table(Group, LRR, LRR.var)
> str(data)
Classes ‘data.table’ and 'data.frame': 7 obs. of 3 variables:
$ Group : chr "A" "B" "C" "D" ...
$ LRR : num Inf 1 2 3 -Inf ...
$ LRR.var: num NaN Inf 3 -Inf -Inf ...
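The coercion can also be checked in base R before any data.table is built. This is a short sketch using the vectors from the question; `m` and `d` are illustrative names only.

```r
Group   <- c("A", "B", "C", "D", "E", "F", "G")
LRR     <- c(Inf, 1, 2, 3, -Inf, 4, 5)
LRR.var <- c(NaN, Inf, 3, -Inf, -Inf, 6, 7)

# cbind() on vectors returns a matrix, which has a single type
m <- cbind(Group, LRR, LRR.var)
is.matrix(m)  # TRUE
typeof(m)     # "character"

# Building a data.frame directly keeps each column's own type
d <- data.frame(Group, LRR, LRR.var)
sapply(d, is.numeric)
```

The same holds for data.table(Group, LRR, LRR.var): passing the vectors as separate arguments avoids the intermediate matrix.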
Another option is filter with if_all from dplyr:
library(dplyr)
data %>%
filter(if_all(where(is.numeric), is.finite))
Group LRR LRR.var
1: C 2 3
2: F 4 6
3: G 5 7