Is there a way to balance data in R without reordering a dataframe?

First, here is some toy data:

df <- data.frame(
  "stim" = c("face", "object", "pareidolia", "face", "face", "object", "pareidolia", "object"),
  "RT" = c(23, 24, 25, 26, 27, 22, 25, 23),
  "Opac" = c(70, 60, 80, 65, 60, 61, 59, 70)
)

I want to ensure that the dataset contains an equal number of rows for each level of stim. I am using the following code to attempt this:

library(dplyr)

newdf <- df %>%
  mutate(mn = min(table(stim))) %>%
  group_by(stim) %>%
  sample_n(mn[1]) %>%
  ungroup()

This works almost perfectly, except that it reorders the data. My desired output would look like the following:

stim        RT  Opac
face        23  70
object      24  60
pareidolia  25  80
face        26  65
object      22  61
pareidolia  25  59

But this code outputs this:

stim        RT  Opac
face        23  70
face        26  65
object      24  60
object      22  61
pareidolia  25  80
pareidolia  25  59

I realize that this is likely happening because I am using table(), but I’m not sure how else to go about this. Any suggestions would be appreciated.

Also, bonus side question: is there a way (a function, code snippet, etc.) to determine the row numbers of the rows that are cut as part of this process?

Solution:

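You could use a filtering strategy rather than sample_n. filter() keeps the surviving rows in their original order:

library(dplyr)  # the .by argument requires dplyr >= 1.1.0

df %>%
  mutate(mn = min(table(stim))) %>%                      # size of the smallest stim group
  filter(sample(seq_along(stim)) <= mn, .by = stim) %>%  # keep mn random rows per group
  select(-mn)                                            # drop the helper column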
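
For the bonus question, one possible approach is to record each row's original position before filtering, so that the dropped rows can be read off afterwards. The sketch below is only an illustration: the names n_keep, tagged, row_id, keep, balanced, and dropped_rows are made up here, the seed is arbitrary, and .by assumes dplyr 1.1.0 or later.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 for the .by argument

df <- data.frame(
  stim = c("face", "object", "pareidolia", "face", "face", "object", "pareidolia", "object"),
  RT   = c(23, 24, 25, 26, 27, 22, 25, 23),
  Opac = c(70, 60, 80, 65, 60, 61, 59, 70)
)

set.seed(1)  # arbitrary seed, only so the random draw is repeatable

n_keep <- min(table(df$stim))  # size of the smallest stim group

tagged <- df %>%
  mutate(row_id = row_number()) %>%                 # illustrative column: original row number
  mutate(keep = sample(n()) <= n_keep, .by = stim)  # flag n_keep random rows within each group

balanced     <- tagged %>% filter(keep) %>% select(-keep)  # balanced data, original order
dropped_rows <- tagged %>% filter(!keep) %>% pull(row_id)  # row numbers that were cut
```

dropped_rows then holds the original row numbers that were removed, and balanced keeps the remaining rows in their original order. Because the draw is random, which rows are dropped depends on the seed.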