Home data.table, calculate difference to last day value

Questions

data.table, calculate difference to last day value

December 12, 2021

I’m having a data.table:

library(data.table)

dt = structure(list(date = c("01.01.2020", "01.01.2020", "02.01.2020", 
                             "02.01.2020", "03.01.2020", "03.01.2020", "04.01.2020", "04.01.2020"
), name = c("10AFC25D", "FA1A310C", "10AFC25D", "FA1A310C", "10AFC25D", 
            "FA1A310C", "10AFC25D", "FA1A310C"), value = c(100L, 50L, 80L, 
                                                           60L, 70L, 60L, 50L, 80L)), row.names = c(NA, -8L), class = c("data.table", "data.frame"))
dt[, date:=as.Date(date, format="%d.%m.%Y")]

That looks like:

> dt
         date     name value
1: 01.01.2020 10AFC25D   100
2: 01.01.2020 FA1A310C    50
3: 02.01.2020 10AFC25D    80
4: 02.01.2020 FA1A310C    60
5: 03.01.2020 10AFC25D    70
6: 03.01.2020 FA1A310C    60
7: 04.01.2020 10AFC25D    50
8: 04.01.2020 FA1A310C    80

Goal:
I want to calculate two new columns which gives the difference to last day value. One column shows the absolute difference and the other the relative difference. And the formular should be that flexibel that I could change the lag from 1 day to 7 days (if I want to compare the same weekdays) – or any other value.

The expected output should look like:

         date     name value diff_absolut diff_relative
1: 01.01.2020 10AFC25D   100           NA            NA
2: 01.01.2020 FA1A310C    50           NA            NA
3: 02.01.2020 10AFC25D    80          -20    -0.2000000
4: 02.01.2020 FA1A310C    60           10     0.2000000
5: 03.01.2020 10AFC25D    70          -10    -0.1250000
6: 03.01.2020 FA1A310C    60            0     0.0000000
7: 04.01.2020 10AFC25D    50          -20    -0.2857143
8: 04.01.2020 FA1A310C    80           20     0.3333333

I could solve it like:

dt2 = copy(dt)
dt2[, date:=date+days(1)]

dt_final = merge(dt, dt2, by=c("date", "name"), all.x=TRUE, suffixes=c("", "_2"))
dt_final[, `:=`(diff_absolute=value-value_2, diff_relative=(value-value_2)/value_2, value_2=NULL)]
dt_final

         date     name value diff_absolute diff_relative
1: 2020-01-01 10AFC25D   100            NA            NA
2: 2020-01-01 FA1A310C    50            NA            NA
3: 2020-01-02 10AFC25D    80           -20    -0.2000000
4: 2020-01-02 FA1A310C    60            10     0.2000000
5: 2020-01-03 10AFC25D    70           -10    -0.1250000
6: 2020-01-03 FA1A310C    60             0     0.0000000
7: 2020-01-04 10AFC25D    50           -20    -0.2857143
8: 2020-01-04 FA1A310C    80            20     0.3333333

This works correctly, but it doesn’t look really elegant and efficient. And as the original data has between 1 and 24 Mio rows I thought I better ask if someone has a smoother solution? Please data.table only. Thank you very much.

>Solution :

If you think of this in terms of rows this should do:

lag = 2L
dt[, diff_absolut := shift(value, n = lag) - value]
dt[, diff_relative := diff_absolut / shift(value, n = lag)]

data.table

byMR

Published December 12, 2021

Add a comment

Python:: Adding array from each iteration to create new larger array

byMR

December 12, 2021

Questions

How can I access the language folders generated by angular i18n?

byMR

December 12, 2021

Questions

.replace function now working as intented (python)

byMR

December 12, 2021

Questions

What's the best way to define a particular number with special characters in a database django

byMR

December 12, 2021

Questions

Exporting a Database using Oracle SQL Developer

byMR

December 12, 2021

Questions

Powershell: Replace all occurrences of different substrings starting with same Unicode char (Regex?)

byMR

December 13, 2021

data.table, calculate difference to last day value

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Python:: Adding array from each iteration to create new larger array

How can I access the language folders generated by angular i18n?

.replace function now working as intented (python)

What's the best way to define a particular number with special characters in a database django

Exporting a Database using Oracle SQL Developer

Powershell: Replace all occurrences of different substrings starting with same Unicode char (Regex?)

Keep Up to Date with the Most Important News

data.table, calculate difference to last day value

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Python:: Adding array from each iteration to create new larger array

How can I access the language folders generated by angular i18n?

.replace function now working as intented (python)

What's the best way to define a particular number with special characters in a database django

Exporting a Database using Oracle SQL Developer

Powershell: Replace all occurrences of different substrings starting with same Unicode char (Regex?)

Discover more from Dev solutions