Randomly selecting 4 rows from distinct groups of a data frame in R

I am working with R, and my data looks similar to this…

group  col_2  col_3   col_4
A      p_m     12      21
A      q_x     11      21
A      i_z     13      22
B      q_z     11      24
B      p_x     14      25
B      i_m     15      26
B      q_m     17      28
C      p_x     16      29
C      i_z     12      23
C      q_m     14      23
C      q_x     13      25 
D      p_z     11      25
D      i_z     15      26
D      q_m     17      28
D      q_x     14      29
E      p_x     13      30
E      i_m     15      26
E      q_m     17      28
E      p_x     16      29
F      i_z     12      23
F      q_x     13      25 
F      p_z     11      25
F      i_z     15      26
G      q_m     17      28
G      q_z     11      24
G      p_x     14      25
G      i_m     15      26
H      q_x     11      21
H      i_z     13      22
H      q_z     11      24
H      p_x     13      30

I need to randomly select 4 rows based on the group column. In other words, my output should not contain two observations that belong to the same group.

For example, a valid result could look like this:

group  col_2  col_3   col_4
A      i_z     13      22
H      i_z     13      22
D      q_m     17      28
F      p_z     11      25

I have tried things like this.

set.seed(1234)
rndmData <- mydata %>%
  sample_n(5)

set.seed(1234)
rndmData <- mydata %>%
  sample_n(distinct(group), 5)

set.seed(1234)
rndmData <- mydata %>%
  sample_n(unique(group), 5)

However, none of them led me to the desired result.

Any help would be great.

Solution:

Sample 4 groups, then sample one row from within each group:

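library(dplyr)

mydata %>%
  # keep only the rows that belong to 4 randomly chosen groups
  filter(group %in% sample(unique(group), size = 4)) %>%
  group_by(group) %>%
  # draw one random row within each of the remaining groups
  slice_sample(n = 1) %>%
  ungroup()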
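
The same two-step idea (sample the groups, then sample one row per group) can also be sketched in base R, which may help when dplyr is not available. This is only an illustration: the small data frame below is made up rather than the asker's data, and the names picked_groups and picked_rows are introduced here for the example.

```r
# Hypothetical example data with the same column names as in the question
mydata <- data.frame(
  group = rep(c("A", "B", "C", "D", "E"), each = 3),
  col_2 = paste0("x_", 1:15),
  col_3 = 11:25,
  col_4 = 21:35,
  stringsAsFactors = FALSE
)

set.seed(1234)

# Step 1: choose 4 distinct groups at random
picked_groups <- sample(unique(mydata$group), size = 4)

# Step 2: for each chosen group, choose one of its row indices at random.
# Indexing with sample.int() avoids the sample(x, 1) pitfall when a group
# has a single row (sample(7, 1) would draw from 1:7, not return 7).
picked_rows <- vapply(
  picked_groups,
  function(g) {
    idx <- which(mydata$group == g)
    idx[sample.int(length(idx), 1)]
  },
  integer(1)
)

rndmData <- mydata[picked_rows, ]
print(rndmData)
```

Because the groups are drawn without replacement in step 1, the result cannot contain two rows from the same group.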