Remove all rows from data.table if there is any infinite value

In the toy example below, I want to delete all rows that contain Inf or NaN values. My actual data.table has many more columns.

library(data.table)

Group <- c("A", "B", "C", "D", "E", "F", "G")
LRR <- c(Inf, 1, 2, 3, -Inf, 4, 5)
LRR.var <- c(NaN, Inf, 3, -Inf, -Inf, 6, 7)
data <- data.table(cbind(Group, LRR, LRR.var))
data

 Group  LRR  LRR.var
 A      Inf  NaN
 B      1    Inf
 C      2    3
 D      3   -Inf
 E     -Inf -Inf
 F      4    6
 G      5    7

To delete all these rows in one go, I am using the following code, but it throws an error.

Code:

data[!is.finite(data)]

Error:

Error: default method not implemented for type 'list'

Can someone suggest a method to delete all rows with any NaN or Inf values from data.table in one go?

I do not want to use code like the one below, because I would have to name every column individually to check for infinite values.

data[is.finite(data$LRR) & is.finite(data$LRR.var), ]

Solution:

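is.finite(data) fails because a data.table is a list, and is.finite has no method for lists; it has to be applied column by column. In addition, all the columns here are of class character, so is.finite and is.infinite return FALSE for every element. According to ?is.infinite: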

is.infinite returns a vector of the same length as x the jth element of which is TRUE if x[j] is infinite (i.e., equal to one of Inf or -Inf) and FALSE otherwise. This will be false unless x is numeric or complex. Complex numbers are infinite if either the real or the imaginary part is.

> str(data)
Classes ‘data.table’ and 'data.frame':  7 obs. of  3 variables:
 $ Group  : chr  "A" "B" "C" "D" ...
 $ LRR    : chr  "Inf" "1" "2" "3" ...
 $ LRR.var: chr  "NaN" "Inf" "3" "-Inf" ...
> is.finite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> is.infinite(data$LRR)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
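
As a small illustration (the vector name x_chr is made up for this sketch), the error from the question and the all-FALSE results above can be reproduced on tiny inputs:

```r
# Hypothetical toy vector: numbers stored as text, as happens after cbind() with a character column
x_chr <- c("Inf", "1", "2")

# Character input: is.finite() is FALSE for every element
chr_result <- is.finite(x_chr)

# After conversion to numeric the check behaves as intended
num_result <- is.finite(as.numeric(x_chr))

# A list (and a data.table is a list) has no is.finite() method, so the call errors;
# tryCatch() captures the message instead of stopping the script
err_msg <- tryCatch(is.finite(list(1, Inf)),
                    error = function(e) conditionMessage(e))

print(chr_result)
print(num_result)
print(err_msg)
```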

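Convert the columns to numeric first. Because data is a data.table, the subsetting can then be done with data.table syntax: apply is.finite to every numeric column through .SD, combine the resulting logical vectors with Reduce and &, and use the result as the row filter. Note that is.finite is also FALSE for NA, so rows with a missing value in a numeric column are dropped as well.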

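library(data.table)
# convert each column to its natural type ("Inf", "-Inf" and "NaN" become numeric)
data <- type.convert(data, as.is = TRUE)
# TRUE only for rows in which every numeric column is finite
data[data[, Reduce(`&`,
     lapply(.SD, is.finite)), .SDcols = is.numeric]]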

-output

    Group LRR LRR.var
1:     C   2       3
2:     F   4       6
3:     G   5       7
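
If the same filter is needed in several places, it could be wrapped in a small function. The sketch below is only one way to do it; the names drop_nonfinite_rows and toy are hypothetical and not part of data.table, and the toy table is built directly from the vectors so that the numeric columns are already numeric.

```r
library(data.table)

# Hypothetical helper: keep only rows in which every numeric column is finite.
# is.finite() is FALSE for Inf, -Inf, NaN and NA, so all of these are dropped.
drop_nonfinite_rows <- function(dt) {
  num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]
  if (length(num_cols) == 0L) return(dt)  # nothing to check
  keep <- dt[, Reduce(`&`, lapply(.SD, is.finite)), .SDcols = num_cols]
  dt[keep]
}

# Toy data from the question, created without cbind()
toy <- data.table(
  Group   = c("A", "B", "C", "D", "E", "F", "G"),
  LRR     = c(Inf, 1, 2, 3, -Inf, 4, 5),
  LRR.var = c(NaN, Inf, 3, -Inf, -Inf, 6, 7)
)

result <- drop_nonfinite_rows(toy)
print(result)
```

On the toy data this keeps groups C, F and G, matching the output above.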

Note: All columns end up as character because cbind on plain vectors builds a matrix, and a matrix can hold only one type, so the numeric vectors are coerced to character to match the Group column. To avoid this, create the data.table or data.frame directly from the vectors:

> data <- data.table(Group, LRR, LRR.var)
> str(data)
Classes ‘data.table’ and 'data.frame':  7 obs. of  3 variables:
 $ Group  : chr  "A" "B" "C" "D" ...
 $ LRR    : num  Inf 1 2 3 -Inf ...
 $ LRR.var: num  NaN Inf 3 -Inf -Inf ...
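
The coercion described in the note can be checked on a two-row sketch (the names m, via_matrix and direct are only for illustration):

```r
library(data.table)

Group <- c("A", "B")
LRR <- c(Inf, 1)

# cbind() on plain vectors returns a matrix, which can hold only one type
m <- cbind(Group, LRR)
print(is.matrix(m))
print(typeof(m))

# Column classes when going through the matrix versus passing the vectors directly
via_matrix <- sapply(data.table(m), class)
direct <- sapply(data.table(Group, LRR), class)
print(via_matrix)
print(direct)
```

Going through the matrix gives two character columns, while passing the vectors directly keeps LRR numeric.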

Another option is to use if_all inside filter from dplyr:

library(dplyr)
data %>% 
  filter(if_all(where(is.numeric), is.finite))

-output

    Group LRR LRR.var
1:     C   2       3
2:     F   4       6
3:     G   5       7