Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pivot_longer with names_pattern and pairs of columns

I am trying to figure out how to use pivot_longer from tidyr in the following example. This is how the original table called dat_plot is structured like:

   year organizational_based action_based ideological_based share_org_based share_ideo_based share_act_based
  <dbl>                <dbl>        <dbl>             <dbl>           <dbl>            <dbl>           <dbl>
1  1956                    1            0                 0               2               95              95
2  2000                    0            0                 0              92               87              91

also here:

dat_plot <- structure(list(year = c(1956, 2000), organizational_based = c(1, 
0), action_based = c(0, 0), ideological_based = c(0, 0), share_org_based = c(2, 
92), share_ideo_based = c(95, 87), share_act_based = c(95, 91
)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
))

I would like to turn it into the long format, in the following way:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  year          based based_value      share share_value
1 1956 organizational           1  org_based           2
2 1956         action           0 ideo_based          95
3 1956    ideological           0  act_based          95
4 2000 organizational           0  org_based          92
5 2000         action           0 ideo_based          87
6 2000    ideological           0  act_based          91

or, with dput:

solution <- structure(list(year = c(1956, 1956, 1956, 2000, 2000, 2000), 
    based = c("organizational", "action", "ideological", "organizational", 
    "action", "ideological"), based_value = c(1, 0, 0, 0, 0, 
    0), share = c("org_based", "ideo_based", "act_based", "org_based", 
    "ideo_based", "act_based"), share_value = c(2, 95, 95, 92, 
    87, 91)), class = "data.frame", row.names = c(NA, -6L))

I thought I have to work with names_pattern, what I have tried is something like this, but if you try you will see, that it is not what I want:

pivot_longer(data=dat_plot, cols=c("share_org_based", "share_ideo_based", "share_act_based",
                    "organizational_based", "action_based", "ideological_based"),
             names_pattern = c("(share_[A-Za-z]+)([A-Za-z]+_based)"),
             names_to = c("share", ".value"),
             values_to = "value")

I am grateful for any leads as to how names_pattern works, or what am I missing.

>Solution :

Here’s another way with tidyr:

dat_plot %>% 
  pivot_longer(cols = starts_with("share"), names_to = "share", names_prefix = "share_", values_to = "share_value") %>%
  pivot_longer(cols = ends_with("based"), names_to = "based", names_pattern = "(.*)_based", values_to = "based_value") %>% 
  filter(substr(share, 1, 3) == substr(based, 1, 3))

output

# A tibble: 6 x 5
   year share      share_value based          based_value
  <dbl> <chr>            <dbl> <chr>                <dbl>
1  1956 org_based            2 organizational           1
2  1956 ideo_based          95 ideological              0
3  1956 act_based           95 action                   0
4  2000 org_based           92 organizational           0
5  2000 ideo_based          87 ideological              0
6  2000 act_based           91 action                   0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading