tidyverse: Is pivot_wider() the only way to summarize by selecting specific row values?

April 15, 2023

I need to compute an index from testing results stored as tidy data. For each group, I need a weighted sum of specific values that returns a single index value. I'm used to using group_by() and summarise(), subsetting with the form df$value[var == 'A'], but I can't get that approach to work. I can only get pivot_wider() to work.

#reprex
library(tidyverse)
#sample data
df <- data.frame(group = c('foo', 'foo', 'foo', 'foo','bar', 'bar', 'bar', 'bar'), 
                 var = c('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'), 
                 result = c(1, 6, 9, 3, 5, 0, 2, 9))

#this does not work, nor does using 'reframe()' as suggested by error
index <- df %>% 
  group_by(group) %>% 
  summarise(var = 'index', 
            result = result[var=='b']/2 + result[var=='d']/3)
#> Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
#> dplyr 1.1.0.
#> i Please use `reframe()` instead.
#> i When switching from `summarise()` to `reframe()`, remember that `reframe()`
#>   always returns an ungrouped data frame and adjust accordingly.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
#> `summarise()` has grouped output by 'group'. You can override using the
#> `.groups` argument.

#using pivot_wider works, is this the only way?
index <- df %>% 
  filter(var %in% c('b', 'd')) %>% 
  pivot_wider(names_from = var, values_from = result) %>% 
  mutate(index = b/2 + d/3) %>% 
  pivot_longer(cols = c('b', 'd', 'index'), 
               names_to = 'var', 
               values_to = 'result')

Solution:

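The problem is that you set var = 'index' first. Within a single summarise() call, later expressions see the columns created by earlier ones, so every subsequent use of var refers to the new value 'index' rather than the original column. The tests var == 'b' and var == 'd' are then always FALSE, the subsets are empty, and summarise() returns zero rows per group, which is what triggers the warning. If you swap the order of result and var in your summarise() statement, it works: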

library(tidyverse)
#sample data
df <- data.frame(group = c('foo', 'foo', 'foo', 'foo','bar', 'bar', 'bar', 'bar'), 
                 var = c('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'), 
                 result = c(1, 6, 9, 3, 5, 0, 2, 9))


index <- df %>% 
  group_by(group) %>% 
  summarise(result = result[var=='b']/2 + result[var=='d']/3, 
            var = 'index')
index
#> # A tibble: 2 × 3
#>   group result var  
#>   <chr>  <dbl> <chr>
#> 1 bar        3 index
#> 2 foo        4 index

Created on 2023-04-14 with reprex v2.0.2
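
To make the order dependence easier to see, here is a minimal sketch that counts how many rows match var == 'b' in each group, once with var overwritten first and once with it overwritten last. The names broken, fixed, and n_match are made up for this illustration and are not part of the original question.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(group = c('foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'),
                 var = c('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'),
                 result = c(1, 6, 9, 3, 5, 0, 2, 9))

# var is overwritten first, so the comparison below is 'index' == 'b',
# which is FALSE: n_match (hypothetical name) is 0 in every group
broken <- df %>%
  group_by(group) %>%
  summarise(var = 'index',
            n_match = sum(var == 'b'))

# var is overwritten last, so the comparison still sees the original
# column: n_match is 1 in every group
fixed <- df %>%
  group_by(group) %>%
  summarise(n_match = sum(var == 'b'),
            var = 'index')
```

Another way to avoid the problem, if the order is awkward to change, would be to give the label column a name that does not clash with an existing column and rename it afterwards.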