Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create a missing data for broken time in data.table?

here is a sample of my data ;

df <- data.table(Date = c(as.Date('2021-02-09'),as.Date('2021-02-10'),as.Date('2021-02-12'),as.Date('2021-02-13'),
                          as.Date('2021-02-14'),as.Date('2021-02-05'),as.Date('2021-02-07'),as.Date('2021-02-08')),
                 Store = c('A','A','A','A','B','B','B','B'),
                 Product = c(1,1,1,1,2,2,2,2),
                 Quantity = c(3,4,5,6,7,8,9,10))

for each group (Store and Product columns represents a group) sometimes can have a break in the Date column.

I want to keep them as missing in Quantity value. Here is my desired output;

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  Date       Store Product Quantity
  <date>     <chr>   <dbl>    <dbl>
  2021-02-09 A           1        3
  2021-02-10 A           1        4
  2021-02-11 A           1       NA
  2021-02-12 A           1        5
  2021-02-13 A           1        6
  2021-02-14 B           2        7
  2021-02-05 B           2        8
  2021-02-06 B           2       NA
  2021-02-07 B           2        9
  2021-02-08 A           2       10

note : I have a limited ram space and a huge table. So operations without reassigning (for example with := operator) would be way better.

Thanks in advance.

>Solution :

A possible solution with J and seq.
Works for Store A, not sure I understood your question correctly for Store B as this creates more rows than your expected result due to Quantity registered on 2021-02-14

df[,.SD[J(Date=seq(min(Date),max(Date),by=1)),on=.(Date)],by=.(Store,Product)]

    Store Product       Date Quantity
    <char>   <num>     <Date>    <num>
 1:      A       1 2021-02-09        3
 2:      A       1 2021-02-10        4
 3:      A       1 2021-02-11       NA
 4:      A       1 2021-02-12        5
 5:      A       1 2021-02-13        6
 6:      B       2 2021-02-05        8
 7:      B       2 2021-02-06       NA
 8:      B       2 2021-02-07        9
 9:      B       2 2021-02-08       10
10:      B       2 2021-02-09       NA
11:      B       2 2021-02-10       NA
12:      B       2 2021-02-11       NA
13:      B       2 2021-02-12       NA
14:      B       2 2021-02-13       NA
15:      B       2 2021-02-14        7
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading