slice_sample number of rows

This is a general question about how slice_sample works. Starting from my original data, I am doing something like this:


df <- dat_longer %>%
  dplyr::select(grupo_int_v00, time, peso1, cintura1, hdl) %>%
  group_by(grupo_int_v00) %>%
  slice_sample(n = 20, replace = TRUE) %>%
  ungroup() %>%
  dput()

This gives me the following output:

df<-structure(list(grupo_int_v00 = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L), .Label = c("A", "B"), label = "Grupo de intervención", class = "factor"), 
    time = c(0, 0, 2, 0, 2, 1, 1, 2, 2, 1, 1, 0, 2, 1, 2, 0, 
    1, 2, 1, 0, 0, 2, 2, 1, 0, 2, 2, 1, 0, 2, 1, 0, 1, 0, 1, 
    2, 1, 0, 0, 0), peso1 = c(100.7, 93, 84.5, 110.2, 76.4, 90.7, 
    93.6, 90.2, 84.8, 82.1, 125.3, 80.2, 76, 64.5, 86.9, 99, 
    83.9, 96.1, 91.6, 89.9, 93.4, 98.8, 70, 67.7, 110.3, 75, 
    87.2, 97.9, 82.7, 69.5, 81.2, 98, 73.8, 91.2, 87, 95, 76.6, 
    103.2, 103.4, 60), cintura1 = c(116.5, 112, 107, 127, NA, 
    106, 98.5, 124, 103.5, 107, 133.5, 104.5, 104.5, 97, 104.5, 
    107, 116, 110, 109, 113, 107, 105, 98, 101, 132, NA, 96.5, 
    118, 110, 85, 106.5, 123, 108, 107.5, 112, 117, 97.5, 114, 
    119, 94), hdl = c(56, 47, 61, 54, NA, 80, 61, 76, 50, 71, 
    64, 47, 59, 61, 59, 49, 49, 68, 71, 59, 55, 43, 52, 53, 42, 
    NA, 40, 40, 58, 60, 53, 62, 56, 48, 58, 39, 54, 63, 45, 45
    )), row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"
))

The result has 40 rows, even though I specified n = 20. I have gone through the function's arguments, but I don’t really understand what is going on.

Thanks in advance

Solution:

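This happens because you call group_by before slice_sample. On a grouped data frame, n applies to each group separately, so you get 20 rows for each of your two groups (A and B), 40 in total. Here is an example using the iris dataset: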

library(dplyr)

iris %>% 
  group_by(Species) %>%
  slice_sample(n = 5)

Output:

# A tibble: 15 × 5
# Groups:   Species [3]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
          <dbl>       <dbl>        <dbl>       <dbl> <fct>     
 1          4.8         3.4          1.9         0.2 setosa    
 2          5           3.3          1.4         0.2 setosa    
 3          5.2         3.5          1.5         0.2 setosa    
 4          4.5         2.3          1.3         0.3 setosa    
 5          5.1         3.8          1.5         0.3 setosa    
 6          5.6         3            4.5         1.5 versicolor
 7          6.5         2.8          4.6         1.5 versicolor
 8          5.8         2.6          4           1.2 versicolor
 9          5.5         2.4          3.7         1   versicolor
10          6.4         3.2          4.5         1.5 versicolor
11          6.7         3.3          5.7         2.1 virginica 
12          6.7         3            5.2         2.3 virginica 
13          5.7         2.5          5           2   virginica 
14          5.8         2.8          5.1         2.4 virginica 
15          7.2         3.2          6           1.8 virginica 
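
To confirm that n is applied per group, one quick check is to count the rows in each group after sampling. This is only a sketch on the built-in iris data; the variable name per_group is made up for illustration.

```r
library(dplyr)

# Sample 5 rows from each Species, then count rows per group.
per_group <- iris %>%
  group_by(Species) %>%
  slice_sample(n = 5) %>%
  count()

per_group  # one row per Species, each with n = 5
```

Three groups with 5 rows each account for the 15 rows in the output above.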

Without group_by:

iris %>% 
  slice_sample(n = 5)

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1          5.2         3.4          1.4         0.2     setosa
2          6.6         2.9          4.6         1.3 versicolor
3          7.2         3.6          6.1         2.5  virginica
4          5.5         3.5          1.3         0.2     setosa
5          4.7         3.2          1.6         0.2     setosa

It returns 5 rows in total, drawn from the whole data frame.
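
If the goal is 20 rows overall, there are two options. You can drop the grouping so that n applies to the whole data frame, or keep the grouping and ask for 10 rows per group. The sketch below uses a made-up toy data frame, since the original dat_longer is not available. The names toy, grupo, and peso are hypothetical stand-ins.

```r
library(dplyr)

set.seed(1)

# Hypothetical stand-in for the original data: two groups, 50 rows each.
toy <- tibble(
  grupo = factor(rep(c("A", "B"), each = 50)),
  peso  = rnorm(100, mean = 90, sd = 12)
)

# Option 1: no grouping, so n = 20 is the total number of rows.
total_20 <- toy %>%
  slice_sample(n = 20, replace = TRUE)

# Option 2: keep the groups balanced, 10 rows per group (2 x 10 = 20).
balanced_20 <- toy %>%
  group_by(grupo) %>%
  slice_sample(n = 10, replace = TRUE) %>%
  ungroup()

nrow(total_20)     # 20
nrow(balanced_20)  # 20
```

Option 1 does not guarantee an equal number of rows from A and B. Option 2 does, which is usually what you want when comparing intervention groups.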
