Slice out sequence of grouped rows

I have this data:

df <- data.frame(
  node = c("A", "B", "A", "A", "A", "B", "A", "A", "A", "B", "B", "B", "B"),
  left = c("ab", "ab", "ab", "ab", "cc", "xx", "cc", "ab", "zz", "xx", "xx", "zz", "zz")
)

I want to count frequencies and proportions per node/left combination and then keep a fixed number of top rows within each node. For this small dataset, I want the two rows with the highest Freq_left values per node. How can that be done? So far I can only extract the single row with the maximum Freq_left per node, not the top two:

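library(dplyr)

df %>%
  group_by(node, left) %>%
  # count each node/left pair; nrow(.) is the total number of rows in df
  summarise(
    Freq_left = n(),
    Prop_left = round(Freq_left / nrow(.) * 100, 4)
  ) %>%
  # n defaults to 1, so only the top row per node is returned
  slice_max(Freq_left)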
# A tibble: 2 × 4
# Groups:   node [2]
  node  left  Freq_left Prop_left
  <chr> <chr>     <int>     <dbl>
1 A     ab            4      30.8
2 B     xx            3      23.1

Expected output:

  node  left  Freq_left Prop_left
  <chr> <chr>     <int>     <dbl>
  A     ab            4     30.8 
  A     cc            2     15.4 
  B     xx            3     23.1 
  B     zz            2     15.4

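Solution:

Use dplyr::slice_max with n = 2. After summarise() the result is still grouped by node, so slice_max keeps the two highest rows within each node. Ordering by Prop_left gives the same ranking as ordering by Freq_left, because the proportion is just the frequency divided by a constant.

dplyr::top_n would also work, but, as @PaulSmith pointed out, it is superseded in favor of dplyr::slice_max: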

library(dplyr)

df %>%
  group_by(node, left) %>%
  # count each node/left pair; nrow(.) is the total number of rows in df
  summarise(
    Freq_left = n(),
    Prop_left = round(Freq_left/nrow(.)*100, 4)
  ) %>%
  slice_max(order_by = Prop_left, n = 2)
#> `summarise()` has grouped output by 'node'. You can override using the `.groups` argument.
#> # A tibble: 4 × 4
#> # Groups:   node [2]
#>   node  left  Freq_left Prop_left
#>   <chr> <chr>     <int>     <dbl>
#> 1 A     ab            4      30.8
#> 2 A     cc            2      15.4
#> 3 B     xx            3      23.1
#> 4 B     zz            2      15.4
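
A possible variation, sketched here rather than taken from the original answer: count() can replace the group_by()/summarise() pair, which avoids relying on nrow(.) and on the leftover grouping from summarise(). The result name top2 is only illustrative, and with_ties = FALSE is an assumption that exactly two rows per node are wanted even when frequencies are tied (by default slice_max keeps tied rows, so a group can return more than n rows).

```r
library(dplyr)

df <- data.frame(
  node = c("A", "B", "A", "A", "A", "B", "A", "A", "A", "B", "B", "B", "B"),
  left = c("ab", "ab", "ab", "ab", "cc", "xx", "cc", "ab", "zz", "xx", "xx", "zz", "zz")
)

top2 <- df %>%
  # one row per node/left pair with its frequency
  count(node, left, name = "Freq_left") %>%
  # proportion relative to all rows; sum(Freq_left) equals nrow(df) here
  mutate(Prop_left = round(Freq_left / sum(Freq_left) * 100, 4)) %>%
  # group explicitly so the slicing is done per node
  group_by(node) %>%
  # assumption: cap at exactly two rows per node, dropping ties
  slice_max(order_by = Freq_left, n = 2, with_ties = FALSE) %>%
  ungroup()

print(top2)
```

For the sample data there are no ties among the top two values in either node, so this should return the same four rows as the answer above.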