Find the last element by group in a data.table without max()

January 23, 2024

I’m trying to find the last element by group in a data.table in a time-efficient way. I have a solution that works:

library(data.table)
Data <- data.table(id = c(rep("a", 2), rep("b",3)), 
                   time = c(1:2, 1:3))
Data[, lastobs := max(time), by = id]
Data <- Data[time == lastobs]
Data[, lastobs := NULL]

but computing max() by group is slow. On my still-manageable dataset of 3 million observations, where time is a yearmonth variable, it takes 20 seconds, and I need to do it for many yearmonths. I now want to move to a much larger dataset, where this becomes infeasible. I suspect there is a data.table way to do this: order the data.table by id and time, then use the .I, .N, or .SD shorthands (which I never fully understand) to keep the last element of each group without computing something like max() within each group. Is there such a solution? My attempt:

Data <- data.table(id = c(rep("a", 2), rep("b",3)), 
               time = c(1:2, 1:3))
Data[,.N == .I, by = id]

This flags the last row of the first group but the first row of the second group, and I don’t really understand this syntax…

Solution:

Assuming the data.table is sorted by id and time (if not, use setorder first):

Data[, .SD[.N,], by = id]
#   id time
#1:  a    2
#2:  b    3
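
If the table is not sorted yet, the full sequence might look like the sketch below. The unsorted input is made up for illustration. The `.I[.N]` variant is an alternative that is not part of the answer above: it collects one row number per group instead of building `.SD` for every group, which may help on large tables, but it has not been benchmarked here.

```r
library(data.table)

# Hypothetical unsorted input with the same columns as in the question
Data <- data.table(id   = c("b", "a", "b", "a", "b"),
                   time = c(2L, 2L, 3L, 1L, 1L))

# Sort in place by id, then time, so each group's last row has the latest time
setorder(Data, id, time)

# .N is the group size, so .SD[.N] is the last row of each group
last_rows <- Data[, .SD[.N], by = id]

# Alternative: .I[.N] is the whole-table row number of each group's last row;
# the unnamed result column is called V1
idx <- Data[, .I[.N], by = id]$V1
last_rows_idx <- Data[idx]

print(last_rows)
print(last_rows_idx)
```

Both results should contain one row per id, holding that id's largest time.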

From the documentation:

".SD is a data.table containing the Subset of x's Data for each group …"

".N is an integer, length 1, containing the number of rows in the group."

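These definitions also suggest why the attempt in the question misfires. `.I` (not quoted above) holds row numbers within the whole table rather than within the group, so it equals `.N` only by coincidence. A small sketch using the question's data follows. The column name `flag` is only illustrative, and `unique()` with `fromLast = TRUE` is shown as one more option, on the assumption that the table is already sorted by id and time.

```r
library(data.table)

Data <- data.table(id = c(rep("a", 2), rep("b", 3)),
                   time = c(1:2, 1:3))

# The attempt: for group "b", .I is 3, 4, 5 while .N is 3,
# so the first row of "b" is flagged instead of the last
flags_attempt <- Data[, .(flag = .N == .I), by = id]$flag
# FALSE TRUE TRUE FALSE FALSE

# A within-group counter compared to .N flags the last row of every group
flags_fixed <- Data[, .(flag = seq_len(.N) == .N), by = id]$flag
# FALSE TRUE FALSE FALSE TRUE

# unique() with fromLast = TRUE keeps the last row seen for each id
last_by_unique <- unique(Data, by = "id", fromLast = TRUE)
print(last_by_unique)
```
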
data.table

by MR

Published January 23, 2024
