How can I rewrite a dplyr::top_n() call that uses a weight with a dplyr::slice_* function?

I would like to replace the superseded top_n() call in the code below with the recommended slice_max() function but I don’t see how to request weighting with slice_max().

top10 <- 
  structure(
    list(
      Variable = c("tfidf_text_crossing", "tfidf_text_best", 
                   "tfidf_text_amazing", "tfidf_text_fantastic",
                   "tfidf_text_player", "tfidf_text_great",
                   "tfidf_text_10", "tfidf_text_progress", 
                   "tfidf_text_relaxing", "tfidf_text_fix"), 
      Importance = c(0.428820580430941, 0.412741988094224,
                     0.368676982306671, 0.361409225854695, 
                     0.331176924533776, 0.307393456208119,
                     0.293945850296236, 0.286313554816565, 
                     0.283457020779205, 0.27899280757397), 
      Sign = c(tfidf_text_crossing = "POS", tfidf_text_best = "POS", 
               tfidf_text_amazing = "POS", tfidf_text_fantastic = "POS", 
               tfidf_text_player = "NEG", tfidf_text_great = "POS", 
               tfidf_text_10 = "POS", tfidf_text_progress = "NEG", 
               tfidf_text_relaxing = "POS", tfidf_text_fix = "NEG")
    ), 
    row.names = c(NA, -10L), 
    class = c("vi", "tbl_df", "tbl", "data.frame"), 
    type = "|coefficient|"
  )

suppressPackageStartupMessages(library(dplyr))

top10 |> 
  group_by(Sign) |> 
  top_n(2, wt = abs(Importance))
#> # A tibble: 4 × 3
#> # Groups:   Sign [2]
#>   Variable            Importance Sign 
#>   <chr>                    <dbl> <chr>
#> 1 tfidf_text_crossing      0.429 POS  
#> 2 tfidf_text_best          0.413 POS  
#> 3 tfidf_text_player        0.331 NEG  
#> 4 tfidf_text_progress      0.286 NEG

Created on 2023-01-06 with reprex v2.0.2

I think I will get the correct answers with:

top10 |> 
  group_by(Sign) |> 
  arrange(desc(abs(Importance))) |> 
  slice_head(n = 2)

but that is far less readable for the novices that I am teaching. Is there an obvious way to do this with a slice_* function?

Solution:

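You can handle the ordering of the data with the order_by argument of slice_max(), which should make it more readable (and it mimics your top_n() code). The selected rows are the same as with top_n(); only their order differs, because slice_max() returns them group by group.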

top10 |>
  group_by(Sign) |>
  slice_max(n = 2, order_by = abs(Importance))
# # A tibble: 4 × 3
# # Groups:   Sign [2]
#   Variable            Importance Sign 
#   <chr>                    <dbl> <chr>
# 1 tfidf_text_player        0.331 NEG  
# 2 tfidf_text_progress      0.286 NEG  
# 3 tfidf_text_crossing      0.429 POS  
# 4 tfidf_text_best          0.413 POS  
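
One detail that may matter when teaching this: slice_max() keeps tied rows by default (with_ties = TRUE), as top_n() does, whereas the arrange() plus slice_head() version always returns exactly n rows per group. The sketch below uses a small made-up tibble (scores, grp and val are hypothetical names, not from the question) to show the difference. The last call assumes dplyr 1.1.0 or later, where slice_max() accepts a by argument as an alternative to group_by().

```r
library(dplyr)

# Hypothetical toy data: group "a" has a tie for second place on abs(val)
scores <- tibble(
  grp = c("a", "a", "a", "b", "b"),
  val = c(-3, 2, -2, 1, -5)
)

# Default with_ties = TRUE: tied rows are all kept, as with top_n()
with_ties <- scores |>
  group_by(grp) |>
  slice_max(order_by = abs(val), n = 2)

# with_ties = FALSE: at most n rows per group, like slice_head()
no_ties <- scores |>
  group_by(grp) |>
  slice_max(order_by = abs(val), n = 2, with_ties = FALSE)

# Per-operation grouping (assumes dplyr >= 1.1.0); the result is ungrouped
by_arg <- scores |>
  slice_max(order_by = abs(val), n = 2, by = grp)
```

Here with_ties has three rows for group "a" because two rows share the second-largest absolute value, while no_ties has two rows per group.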