Find specific value in nested data frames and get position and/or value

I have a nested list sampleList that can contain a variable number of data frames. In this example there are 3 data frames:

df1 <- data.frame(id = as.integer(c(1, 6)), key = c('apple', 'apple.green'), stringsAsFactors=FALSE)
df2 <- data.frame(id = as.integer(c(1, 3, 5)), key = c('apple', 'apple.red', 'apple.red.rotten'), stringsAsFactors = FALSE)
df3 <- data.frame(id = as.integer(c(17)), key = c('orange'), stringsAsFactors = FALSE)
sampleList <- list(df1, df2, df3)

I want to search for a specific integer, e.g. 6, in the id column of every data frame in sampleList. As a result, I need the position of each match and, if possible, the associated value from the key column.

The closest I got was finding the position within one specific data frame, e.g. the first one:

which(sampleList[[1]] == 6)
[1] 2

Since the number of data frames can be different each time, I need a more dynamic query.

Thanks a lot for your help.

Solution:

I have slightly altered the data, adding 6 to df3.

df3 <- data.frame(id = as.integer(c(17, 6)), key = c('orange', "blue"), stringsAsFactors = FALSE)
sampleList <- list(df1, df2, df3) # rebuild the list so it picks up the altered df3

Filter(nrow, 
       lapply(sampleList, subset, id == 6)
)
[[1]]
  id         key
2  6 apple.green

[[2]]
  id  key
2  6 blue

Explanation: lapply first subsets each list element by the criterion. Filter then drops the elements whose nrow is 0, because Filter coerces the result of nrow to logical and 0 becomes FALSE.
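
To make the coercion step concrete, here is a minimal sketch that compares Filter(nrow, ...) with an explicit nrow > 0 test. The names hits, implicit and explicit are only illustrative, and the two small data frames are a cut-down copy of the question's data.

```r
# Cut-down sample data: df1 contains id 6, df3 does not
df1 <- data.frame(id = c(1L, 6L), key = c("apple", "apple.green"), stringsAsFactors = FALSE)
df3 <- data.frame(id = 17L, key = "orange", stringsAsFactors = FALSE)

# Subset every data frame; non-matching ones become zero-row data frames
hits <- lapply(list(df1, df3), subset, id == 6)

# Filter() coerces nrow() to logical, so zero-row elements are dropped
implicit <- Filter(nrow, hits)

# The same selection written out explicitly
explicit <- hits[vapply(hits, nrow, integer(1)) > 0]

same <- identical(implicit, explicit)
```

Both variants should keep only the data frame that actually contains a match; the explicit form is longer but may be easier to read for people unfamiliar with the 0-to-FALSE coercion.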

To extract the positions (stored as row names of the data frames, which match the row positions as long as the data frames use the default row names),

Filter(nrow, 
       lapply(sampleList, subset, id == 6)
) |>
  lapply(\(x) as.integer(rownames(x)))

To make it clear in which data.frame matches were found,

Filter(nrow, 
       lapply(sampleList, subset, id == 6) |>
         setNames(seq_along(sampleList)) # swap to appropriate naming policy
) |>
  lapply(\(x) as.integer(rownames(x)))
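
If a single result table is more convenient, the list index, row position and key can be collected in one data frame. The following is only a sketch under the assumption that every data frame has id and key columns; find_id and the column names list_index and row are hypothetical names, not part of the answer above.

```r
# Sample data, including the altered df3 that also contains id 6
df1 <- data.frame(id = as.integer(c(1, 6)), key = c("apple", "apple.green"), stringsAsFactors = FALSE)
df2 <- data.frame(id = as.integer(c(1, 3, 5)), key = c("apple", "apple.red", "apple.red.rotten"), stringsAsFactors = FALSE)
df3 <- data.frame(id = as.integer(c(17, 6)), key = c("orange", "blue"), stringsAsFactors = FALSE)
sampleList <- list(df1, df2, df3)

# Hypothetical helper: one output row per match across all data frames
find_id <- function(dfs, target) {
  hits <- lapply(seq_along(dfs), function(i) {
    rows <- which(dfs[[i]]$id == target)   # row positions of the matches
    data.frame(
      list_index = rep(i, length(rows)),   # which data frame matched
      row = rows,                          # position inside that data frame
      key = dfs[[i]]$key[rows],            # associated key value
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, hits)                     # stack the per-data-frame results
}

res <- find_id(sampleList, 6L)
```

With the data above, res should report row 2 of the first data frame (key apple.green) and row 2 of the third (key blue). Because it uses which() rather than row names, this variant does not depend on the data frames having default row names.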