How to subset a dataframe based on values in a list column

I’m having an issue where I’m pulling down information from an API, and there are nested values within specific columns. I need to filter on those values in order to return the information I need. Here’s an example:

library(dplyr)

# Make Data
problem <- list(list("thing 1", "thing 2"), list("thing 1", "thing 2", "thing 3"), list("thing 1"))
name <- list("joe", "sue", "nancy")

df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# How can I subset the rows where the problem column contains "thing 3"?
filter(df, name == "sue") # this works fine
filter(df, "thing 3" %in% problem) # this doesn't

It’s obvious to me that it’s because the list is nested and filter() isn’t "seeing" the data, but it’s less clear to me how to get around it. Additionally, the data that I’m returning is fairly large, and has an arbitrary number of items per list within the column, so I don’t want to unnest the column if I can avoid it.

EDIT: I’m not married to a dplyr solution, and in fact if there is a data.table solution, I’d be especially interested to hear it, but I’m great with base or whatever!

Any help would be appreciated.

Solution:

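library(purrr)  # provides map_lgl(), which dplyr does not export

# Test each row's list separately; keep rows where any element equals "thing 3"
df %>%
  filter(map_lgl(problem, ~ any("thing 3" == .x)))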

  name      problem
1  sue thing 1,....
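
Since the question also asks about base R and data.table, here is a minimal sketch of the same idea without purrr. The original `"thing 3" %in% problem` is evaluated once against the whole column, so it yields a single value rather than one per row; the fix in every case is to build a logical vector with one entry per row. `has_value()` is a hypothetical helper name introduced here for illustration, and the data.table branch assumes the package is installed (it is skipped otherwise).

```r
# Rebuild the example data from the question.
problem <- list(
  list("thing 1", "thing 2"),
  list("thing 1", "thing 2", "thing 3"),
  list("thing 1")
)
df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# has_value() is a hypothetical helper (not part of any package):
# it returns one TRUE/FALSE per row, TRUE when `value` appears
# among the elements of that row's list.
has_value <- function(col, value) {
  vapply(col, function(x) value %in% unlist(x), logical(1))
}

# Base R: index the rows with the logical vector.
base_result <- df[has_value(df$problem, "thing 3"), ]

# data.table (assumed installed): the same logical vector goes in `i`.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::data.table(name = c("joe", "sue", "nancy"),
                               problem = problem)
  dt_result <- dt[has_value(problem, "thing 3")]
}
```

Neither version unnests the column, so the varying number of items per row is not a problem. `vapply()` with `logical(1)` is used instead of `sapply()` so that a row with an empty list still produces a single FALSE.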