data.table text filtering R

I am trying to filter a text column of a data.table and am looking for an equivalent of dplyr::filter (I am using data.table for efficiency reasons).

However, my data.table filter returns only the rows where the string matches exactly. In contrast, my dplyr::filter call returns every row in which the pattern is found, not only the exact matches.

See below for an example.

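library(data.table)

df <- data.frame(first  = c("value_1 and value_2", "value_2", "value_1", "value_1"),
                 second = c(1, 2, 3, 4))

# Exact matching: keeps only rows where first equals "value_1"
dt.output <- setDT(df)[first %in% c("value_1")]
# Pattern matching: keeps rows where first contains "value_1"
filter.output <- dplyr::filter(df, grepl("value_1", first))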

dt.output returns only the rows where first is exactly value_1 (rows 3 and 4).
filter.output returns every row where first contains value_1 (rows 1, 3, and 4).

Is it possible to use data.table to filter text while returning the same results as dplyr::filter?

>Solution :

This difference has nothing to do with dplyr::filter versus data.table. The %in% operator tests for exact, whole-string matches, while grepl returns TRUE for any string that contains the pattern, so substring matches count as well. If we use grepl inside the data.table, we get the same result:

library(data.table)
setDT(df)[grepl("value_1", first)]
                 first second
1: value_1 and value_2      1
2:             value_1      3
3:             value_1      4
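
To see why the two calls differ, it can help to look at the logical vectors each expression produces before any rows are selected. The sketch below uses only base R on the same first values from the question; the variable names exact and partial are made up for illustration.

```r
first <- c("value_1 and value_2", "value_2", "value_1", "value_1")

# %in% compares whole strings, so only elements equal to "value_1" match
exact <- first %in% c("value_1")

# grepl() tests whether the pattern occurs anywhere inside each string
partial <- grepl("value_1", first)

print(exact)    # FALSE FALSE TRUE TRUE
print(partial)  # TRUE FALSE TRUE TRUE
```

Whichever logical vector is passed as the row filter decides which rows come back, regardless of whether the subsetting is done by data.table or dplyr.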

Alternatively, use data.table's %like% operator, which is a wrapper around grepl:

setDT(df)[first %like% "value_1"]
                 first second
1: value_1 and value_2      1
2:             value_1      3
3:             value_1      4
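
One caveat: grepl and %like% treat the pattern as a regular expression by default, so characters such as . or ( have a special meaning. If the search text should be matched literally, one option is to pass fixed = TRUE to grepl. The following is a minimal sketch under that assumption; the table dt_demo and its values are hypothetical and not part of the original question.

```r
library(data.table)

# Hypothetical data: the "." in the search text should be matched literally
dt_demo <- data.table(first  = c("value.1 and value_2", "valueX1", "value.1"),
                      second = c(1, 2, 3))

# As a regex, "." matches any character, so "valueX1" is matched too
regex_rows <- dt_demo[grepl("value.1", first)]

# With fixed = TRUE the pattern is treated as a plain string
literal_rows <- dt_demo[grepl("value.1", first, fixed = TRUE)]

print(regex_rows$second)    # 1 2 3
print(literal_rows$second)  # 1 3
```

For the pattern in the question, value_1, both forms give the same rows because it contains no regex metacharacters.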