Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

create two dataframes at once and joint their columns alternatively in R

assume this is my dataframe ( already created by a sequence of pipes )

df <- data.frame(month = c("May", "May","May","June","June","June", "July", "July","July"),
                 rate = rep(c(0.3, 0.4, 0.5),times=3),
                 a = c(5,4,3,10,9,8,20,19,18),
                 b = c(100,200,300,1,2,3,10,20,30))
df

  month rate  a   b
1   May  0.3  5 100
2   May  0.4  4 200
3   May  0.5  3 300
4  June  0.3 10   1
5  June  0.4  9   2
6  June  0.5  8   3
7  July  0.3 20  10
8  July  0.4 19  20
9  July  0.5 18  30

I am looking to follow the dplyr pipes and create dataframe below:

  month r0.3_a r0.3_b r0.4_a r0.4_b r0.5_a r0.5_b
1   May      5    100      4    200      3    300
2  June     10      1      9      2      8      3
3  July     20     10     19     20     18     30

so this is what I did:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

library(tidyverse)
library(dplyr)
library(gtools)
df_a <- df %>%
      select(-b) %>%
      pivot_wider(names_from = rate,
                  values_from = a,
                  names_glue = "r{rate}_{.value}" )

df_b <- df %>%
  select(-a) %>%
  pivot_wider(names_from = rate,
              values_from = b,
              names_glue = "r{rate}_{.value}" )


output <- inner_join(df_a, df_b, by= "month") %>%
          select(month, mixedsort(names(.)[-1]))

My question is: assuming df is coming from a sequence of pipes, is there a way to achieve the output without having to stop the pipes and create two separate df s out of the original df?

>Solution :

Use values_from as a vector of columns

library(tidyr)
library(dplyr)
pivot_wider(df, names_from = rate, values_from = c(a, b),
     names_glue = "r{rate}_{.value}") %>% 
     select(month, gtools::mixedsort(names(.)[-1]))

-output

# A tibble: 3 Ă— 7
  month r0.3_a r0.3_b r0.4_a r0.4_b r0.5_a r0.5_b
  <chr>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
1 May        5    100      4    200      3    300
2 June      10      1      9      2      8      3
3 July      20     10     19     20     18     30
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading