How to get mean and sd statistics instead of median and iqr when using tbl_continuous from gtsummary

October 7, 2023

I am using the gtsummary package. To get descriptive statistics for the ‘desg’ and ‘group’ variables crossed, I used the tbl_continuous function.

The result reports the median and IQR, but I want the mean and SD instead.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(gtsummary)

# Sample data
data <- data.frame(
  desg = c('a', 'b', 'c', 'a', 'b', 'c'),
  group = c('before', 'before', 'before', 'after', 'after', 'after'),
  values = c(10, 15, 12, 18, 22, 20)
)

data %>%
  select(desg, group, values) %>%
  tbl_continuous(variable = values, by = group) %>%
  modify_spanning_header(all_stat_cols() ~ "**Treatment Assignment**")
Characteristic    Treatment Assignment
                  after, N = 3¹        before, N = 3¹
desg
    a             18.0 (18.0, 18.0)    10.0 (10.0, 10.0)
    b             22.0 (22.0, 22.0)    15.0 (15.0, 15.0)
    c             20.0 (20.0, 20.0)    12.0 (12.0, 12.0)
¹ values: Median (IQR)
Created on 2023-10-07 with reprex v2.0.2

Solution:

You need to provide the appropriate statistic parameter. The docs state:

statistic: List of formulas specifying types of summary statistics to display for each variable. The default is everything() ~ {median} ({p25}, {p75})

I have to say, the format for providing this list is not entirely clear to me from this, but fortunately the tbl_summary() docs are a little more helpful:

The default is list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~ "{n} ({p}%)")

In our case that means:

data %>%
    select(desg, group, values) %>%
    tbl_continuous(
        variable = values, by = group,
        statistic = list(
            everything() ~ "{mean} ({sd})"
        )
    ) %>%
    modify_spanning_header(all_stat_cols() ~ "**Treatment Assignment**")
# Characteristic    Treatment Assignment
#                   after, N = 3¹    before, N = 3¹
# desg
#     a             18.0 (NA)        10.0 (NA)
#     b             22.0 (NA)        15.0 (NA)
#     c             20.0 (NA)        12.0 (NA)
# ¹ values: Mean (SD)

Your sample data has only one observation for each combination of desg and group, so the standard deviation is undefined and sd() returns NA here.
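
To see the SD filled in, here is a minimal sketch using made-up data with two observations per desg/group cell. The data frame data2 and its values are hypothetical and not from the question. The per-cell SD is cross-checked with base R's aggregate(), and the gtsummary call is only run if the package is installed.

```r
# Base R: the standard deviation of a single value is undefined
sd(10)
#> [1] NA

# Hypothetical data (not from the question):
# two observations per desg/group cell
data2 <- data.frame(
  desg   = rep(c("a", "b", "c"), times = 4),
  group  = rep(c("before", "after"), each = 6),
  values = c(10, 15, 12, 12, 17, 14, 18, 22, 20, 20, 24, 22)
)

# Per-cell SD computed directly, as a cross-check.
# Each cell holds two values that differ by 2, so every SD is sqrt(2) (about 1.41).
cell_sd <- aggregate(values ~ desg + group, data = data2, FUN = sd)
cell_sd

# Same statistic specification as above, applied to the new data
if (requireNamespace("gtsummary", quietly = TRUE)) {
  library(gtsummary)
  tbl_continuous(
    data2,
    variable = values, by = group,
    statistic = list(everything() ~ "{mean} ({sd})")
  )
}
```

With more than one observation per cell, the (SD) part of each entry should show a number rather than NA.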