How to calculate the difference between rows and divide the difference with the value from the previous row in R?

Let’s say I have the following dataframe:

    A    B    C
1  15   14   12
2   7    1    6
3   8   22    5
4  11    5    1
5   4   12    4

I want to calculate the difference between the rows and then divide the difference by the value of the previous row. This is done for each variable.

The result would be something like this:

    A    B    C     A_r     B_r      C_r
1  15   14   12      NA      NA       NA
2   7    1    6   -0.53   -0.93    -0.50
3   8   22    5    0.14   21.00    -0.17
4  11    5    1     ...     ...      ...
5   4   12    4     ...     ...      ...

The general formula would be:

R(n) = [S(n) - S(n-1)] / S(n-1)

Here R is the newly calculated variable, S is the existing variable that R is calculated from (A, B, or C in this example), and n is the row number.
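As a quick check of the formula, here is a minimal sketch that applies it by hand to column A only. The vector names S and R simply mirror the symbols in the formula, and the leading NA is an assumption about how the first row (which has no previous row) should be represented.

```r
# Column A from the example data frame
S <- c(15, 7, 8, 11, 4)
n <- length(S)

# S[-1] is rows 2..n (the current values), S[-n] is rows 1..(n-1) (the previous values).
# Row 1 has no previous row, so R(1) is set to NA.
R <- c(NA, (S[-1] - S[-n]) / S[-n])
R
```

The second element is (7 - 15) / 15, which matches the -0.53 shown for A_r in row 2.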

I know I can use the diff function to calculate the difference, but I don't know how to divide that difference by the values in the previous rows.

Solution:

We can use across together with lag. across loops over all the columns (everything()) and applies the formula to each one, with lag supplying the value from the previous row. The .names argument controls the names of the new columns; here it appends the suffix _r to each original column name ({.col}).

library(dplyr)

# For every column: (current - previous) / previous; lag() returns NA for the first row
df1 <- df1 %>%
  mutate(across(everything(), ~ (.x - lag(.x)) / lag(.x),
                .names = "{.col}_r"))

Output:

df1
   A  B  C        A_r        B_r        C_r
1 15 14 12         NA         NA         NA
2  7  1  6 -0.5333333 -0.9285714 -0.5000000
3  8 22  5  0.1428571 21.0000000 -0.1666667
4 11  5  1  0.3750000 -0.7727273 -0.8000000
5  4 12  4 -0.6363636  1.4000000  3.0000000

Alternatively, in base R, divide the row differences from diff by the data frame shifted down by one row:

# Pad numerator and denominator with a leading NA row so both line up with df1
df1[paste0(names(df1), "_r")] <- rbind(NA, diff(as.matrix(df1))) /
  rbind(NA, df1[-nrow(df1), ])
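If the matrix arithmetic above is hard to follow, the same idea can be written column by column. The following is only a sketch under a few assumptions: df1 is rebuilt here from the values in the question so the block runs on its own, and rel_change is a hypothetical helper name, not part of base R or dplyr.

```r
# Example data from the question
df1 <- data.frame(A = c(15, 7, 8, 11, 4),
                  B = c(14, 1, 22, 5, 12),
                  C = c(12, 6, 5, 1, 4))

# Hypothetical helper: relative change of each element versus the previous one
rel_change <- function(x) {
  prev <- c(NA, head(x, -1))  # previous row's value; NA for the first row
  (x - prev) / prev
}

# Compute the ratios for every original column, then attach them with the "_r" suffix
ratios <- lapply(df1, rel_change)
df1[paste0(names(df1), "_r")] <- ratios
df1
```

Computing the ratios before assigning them keeps the new _r columns from being fed back into the calculation.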