Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I match the values of multiple variables according to the dates of a variable of interest and summarise them alone in R using dplyr?

I have a dataset that looks like this :

var date value
A 2022-01-01 1
A 2022-01-02 2
A 2022-01-03 3
A 2022-01-04 4
A 2022-01-05 5
A 2022-01-06 6
A 2022-01-07 7
B 2022-02-02 10
B 2022-01-03 20
B 2022-01-07 30
C 2022-01-01 100
C 2022-01-04 200
C 2022-01-05 300
C 2022-06-06 400

My variable of interest is the A from column var.Specifically the dates of A that match the values of the other factors inn var variable.I want to pivot wider them into :

date A B C
2022-01-01 1 NA 100
2022-01-02 2 NA NA
2022-01-03 3 20 NA
2022-01-04 4 NA 200
2022-01-05 5 NA 300
2022-01-06 6 NA NA
2022-01-07 7 30 NA

And at the end to each column to summarize the sum (or even the correlation of A with B and the correlation A with C):

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

var Sum
A 28
B 50
C 600

How can I do it in R using dplyr package ?

library(tidyverse)
date = c(seq(as.Date("2022/1/1"), by = "day", length.out = 7),
      as.Date("2022/2/2"),as.Date("2022/1/3"),as.Date("2022/1/7"),
      as.Date("2022/1/1"),as.Date("2022/1/4"),as.Date("2022/1/5"),as.Date("2022/6/6"))
var = c(rep("A",7),rep("B",3),rep("C",4))
value = c(seq(1,7,1),10,20,30,100,200,300,400)
data = tibble(var,date,value);data

>Solution :

We may convert to ‘wide’ format with pivot_wider, filter out the NA element rows from A column, summarise acrossthe columns to get thesumand reshape to 'long' withpivot_longer`

library(dplyr)
library(tidyr)
pivot_wider(data, names_from = var, values_from = value) %>% 
   filter(!is.na(A)) %>% 
   summarise(across(A:C, sum, na.rm = TRUE)) %>% 
  pivot_longer(cols = everything(), names_to = 'var', values_to = 'Sum')
# A tibble: 3 × 2
  var     Sum
  <chr> <dbl>
1 A        28
2 B        50
3 C       600
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading