Filtering numeric data within a for-loop in R

I’ve got a series of seven (non-sequential) numbers and a large dataset, and I want to filter the data seven times, once per number, and save the results in a list. In each case I want to keep only the rows where the value column is less than or equal to the corresponding number in vals.

The dataset looks something like this:

   subj <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
   var1 <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
   value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11) 

   df <- data.frame(subj, var1, value)

And the code I’ve been working with looks like this:

 library(dplyr)

 vals <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)
 output <- vector("list", length(7))

 for (i in 1:length(vals)) {
   new_data <- df %>%
     filter(value <= i)
   output[[i]] <- new_data
 }

This runs fine but it doesn’t filter the data, so I end up with a list where each of the 7 elements is exactly the same. I would like each of the 7 list elements to include only the rows where value <= i, and to discard those where value > i.

I’d also like the name of each element in the list to be the respective value from vals, if possible, though I can’t figure out how to do this with numeric data.

Solution:

Please try

lapply(vals, \(i) subset(df, value <= i))

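This returns a list of data frames, one per element of vals, each containing only the rows where value is less than or equal to that element. The \(i) shorthand needs R 4.1 or later; on older versions write function(i) instead.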
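
The question also asks for each list element to be named after its threshold. One way to sketch this, assuming the df and vals from the question, is to wrap the lapply() call in setNames(). The name output is reused from the question, and function(v) is used in place of \(v) only so the snippet also runs on R versions before 4.1.

```r
# Example data copied from the question
subj  <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
var1  <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
df    <- data.frame(subj, var1, value)
vals  <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)

# One filtered data frame per threshold; assigning numeric names
# converts them to character, e.g. "0.15"
output <- setNames(
  lapply(vals, function(v) df[df$value <= v, ]),
  vals
)

names(output)      # character versions of vals
output[["0.18"]]   # rows with value <= 0.18
```

Because the names are the character form of the numbers, 0.50 becomes "0.5", so that element is reached with output[["0.5"]] or by position with output[[7]].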

If you would rather keep your explicit for-loop, compare against the element of vals at position v rather than against the loop counter itself:

library(dplyr)
output <- vector("list", length = length(vals))
for (v in seq_along(vals)) {
  # vals[[v]] is the threshold at position v
  output[[v]] <- df %>%
    filter(value <= vals[[v]])
}

We can do this in base R as well:

output <- vector("list", length = length(vals))
for (v in seq_along(vals)) {
  # df$value refers to the column, not a global vector named value
  output[[v]] <- df[df$value <= vals[[v]], ]
}
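
As a footnote on why the original loop returned seven identical data frames, here is a small check using the question's value and vals. The helper names counter, rows_kept_by_counter and rows_kept_by_vals are made up for this illustration.

```r
value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
vals  <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)

# What the original loop compared against: the positions 1, 2, ..., 7
counter <- 1:length(vals)

# The largest value is 0.73, so `value <= i` is TRUE for every row at every step
rows_kept_by_counter <- sapply(counter, function(i) sum(value <= i))

# Comparing against the thresholds themselves keeps a different number of rows
rows_kept_by_vals <- sapply(vals, function(v) sum(value <= v))

rows_kept_by_counter   # 10 10 10 10 10 10 10
rows_kept_by_vals      # 3 4 4 5 5 5 7

# length(7) is the length of the one-element vector 7, which is 1
length(7)
```

The last line shows a second, harmless slip in the original: vector("list", length(7)) allocates a list of length 1, and the loop only works because R grows the list on assignment. Using length(vals) allocates all seven slots up front.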