Define custom transformation of ggplot axis labels with trans_new function

I am working on percentage changes between periods and struggling with the logarithmic transformation of axis labels. Here is an example based on the storms dataset:

library(dplyr)
library(ggplot2)
library(scales)

df <- storms |>
  group_by(year) |>
  summarise(wind = mean(wind)) |>
  mutate(lag = lag(wind, n = 1)) |>
  mutate(perc = (wind / lag) - 1) |>
  tidyr::drop_na()

I want to visualize the distribution of percentages, making the percentage change symmetrical (log difference) with log1p.

ggplot(df, aes(x = log1p(perc))) +
  geom_histogram(bins = 5) 

[Figure: histogram with log1p values on the x-axis]
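
To see why log1p makes the changes symmetrical: log1p(perc) equals log(wind / lag), so a rise and the fall that undoes it get the same magnitude with opposite signs. A minimal sketch with made-up values (not taken from storms):

```r
# Made-up percentage changes: a doubling (+100%) and the halving (-50%) that undoes it.
perc <- c(up = 1.00, down = -0.50)

# log1p(x) is log(1 + x); with perc = wind / lag - 1 this is log(wind / lag).
log_change <- log1p(perc)

# The two log changes have equal size and opposite sign.
symmetric <- isTRUE(all.equal(log_change[["up"]], -log_change[["down"]]))

# expm1() is the inverse of log1p(), so it recovers the original percentages.
back <- expm1(log_change)
```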


At this point I wanted to transform the x-axis label back to the original percentage value.
I tried to create my own transformation with trans_new, and applied it to the labels in scale_x_continuous, but I can’t make it work.

trans_perc <- trans_new(
  name      = "trans_perc",
  transform = log1p_trans(),
  inverse   = function(x)
    expm1(x),
  breaks    = breaks_log(),
  format    = percent_format(),
  domain    = c(-Inf, Inf)
)

ggplot(df, aes(x = log1p(perc))) +
  geom_histogram(bins = 5) +
  scale_x_continuous(labels = trans_perc)

Currently, the result is:

Error in get_labels():
! breaks and labels are different lengths
Run rlang::last_error() to see where the error occurred.

Thanks!

Solution:

If I understand you correctly, you want to keep the shape of the histogram but change the labels so that they show the value of the perc column rather than the transformed log1p(perc) value. If so, there is no need for a transformer object. You can simply pass the inverse transformation (plus formatting) as a function to the labels argument of scale_x_continuous:

ggplot(df, aes(x = log1p(perc))) +
  geom_histogram(bins = 5) +
  scale_x_continuous("Percentage Change", labels = ~ percent(expm1(.x)))


Note that although the histogram remains symmetrical in shape, the axis labels represent the back-transformed values of the original axis labels.
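
As a quick check of what the labels function does, here is a small sketch that applies the same back-transformation to a few made-up break positions (illustrative values, not the breaks ggplot would choose):

```r
library(scales)

# Made-up break positions on the plotted (log1p) scale.
axis_breaks <- c(-0.1, 0, 0.1)

# Same logic as the labels argument: undo log1p(), then format as a percentage.
label_fun <- function(x) percent(expm1(x))

axis_labels <- label_fun(axis_breaks)
axis_labels
```

Apart from zero, the labels come out as non-round percentages, because the breaks are evenly spaced on the log1p scale rather than on the percentage scale.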

If you want the breaks to be at round numbers on the back-transformed scale, then you can do:

ggplot(df, aes(x = log1p(perc))) +
  geom_histogram(bins = 5) +
  scale_x_continuous("Percentage Change", 
                     breaks = log1p(pretty(df$perc, 5)),
                     labels = ~ percent(expm1(.x)))
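
The breaks argument works in two steps: pretty() picks round values on the percentage scale, and log1p() moves them to where they sit on the plotted axis. A sketch with made-up percentages standing in for df$perc:

```r
# Made-up percentage changes standing in for df$perc.
perc <- c(-0.12, -0.03, 0.02, 0.08, 0.15)

# Round values on the original percentage scale.
round_breaks <- pretty(perc, 5)

# Their positions on the plotted log1p scale
# (assumes every break is above -1, i.e. above -100%).
axis_breaks <- log1p(round_breaks)

# Back-transforming the positions recovers the round values the labels will show.
recovered <- expm1(axis_breaks)

# Equal steps in percentage become gradually narrower steps on the log1p axis.
gaps <- diff(axis_breaks)
```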


I think this second version is preferable because it shows the log nature of the x scale more clearly, including the logarithmically spaced breaks and grid lines.
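
If a transformer object is still wanted, it would go in the trans argument of the scale rather than in labels, with the untransformed perc mapped to x. As far as I can tell, the error in the question arises because labels expects a function or a vector of labels, and a trans object is neither. The following is only a sketch of how the question's trans_perc might be repaired, not a tested drop-in: transform is given as a plain function instead of the object returned by log1p_trans(), breaks_log() is dropped because it is designed for positive values, and the domain lower bound of -1 is an assumption.

```r
library(scales)

# Sketch of a repaired transformer: transform and inverse are both plain functions.
trans_perc <- trans_new(
  name      = "trans_perc",
  transform = function(x) log1p(x),
  inverse   = function(x) expm1(x),
  format    = percent_format(),
  domain    = c(-1, Inf)  # assumption: changes below -100% are not meaningful
)

# Made-up percentage changes to check that the two functions undo each other.
x <- c(-0.1, 0, 0.25)
round_trip <- trans_perc$inverse(trans_perc$transform(x))
```

It could then be used as ggplot(df, aes(x = perc)) + geom_histogram(bins = 5) + scale_x_continuous(trans = trans_perc), which should give the same histogram shape with percentage labels. I believe recent ggplot2 releases name this argument transform instead of trans.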
