Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I aggregate the data on the same line

After a pivot_wider(), I get the following data.frame.
How can I aggregate the data on the same line?

  CODE_C   CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
  <chr>    <chr>  <chr>         <chr>              <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1 FABR**01 FABR** Abricot, 82   Abricot             1.32        NA           NA           NA           NA           NA   
2 FABR**01 FABR** Abricot, 82   Abricot            NA            1.10        NA           NA           NA           NA   
3 FABR**01 FABR** Abricot, 82   Abricot            NA           NA            3.33        NA           NA           NA   
4 FABR**01 FABR** Abricot, 82   Abricot            NA           NA           NA            4.71        NA           NA   
5 FABR**01 FABR** Abricot, 82   Abricot            NA           NA           NA           NA            4.38        NA   
6 FABR**01 FABR** Abricot, 82   Abricot            NA           NA           NA           NA           NA            3.25

To reproduce the data frame :

structure(list(CODE_C = c("FABR**01", "FABR**01", "FABR**01", 
"FABR**01", "FABR**01", "FABR**01"), CODE_P = c("FABR**", "FABR**", 
"FABR**", "FABR**", "FABR**", "FABR**"), LIB_COMPOSANT = c("Abricot, 82", 
"Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82"
), LIB_PRODUIT = c("Abricot", "Abricot", "Abricot", "Abricot", 
"Abricot", "Abricot"), `2020-01-01` = c(1.32446153846154, NA, 
NA, NA, NA, NA), `2020-02-01` = c(NA, 1.09984615384615, NA, NA, 
NA, NA), `2020-03-01` = c(NA, NA, 3.33157894736842, NA, NA, NA
), `2020-04-01` = c(NA, NA, NA, 4.70916279069767, NA, NA), `2020-05-01` = c(NA, 
NA, NA, NA, 4.37848648648649, NA), `2020-06-01` = c(NA, NA, NA, 
NA, NA, 3.24713953488372)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

The initial pivot_wider() looks like this :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

pivot_wider(DONNEES_COMPOSANT,
            names_from = date,
            values_from = PRIX)

Expected output :

 CODE_C   CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
  <chr>    <chr>  <chr>         <chr>              <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1 FABR**01 FABR** Abricot, 82   Abricot             1.32        1.10           3.33           4.71           4.38           3.25   

>Solution :

I’ll infer that your input data is

DONNEES_COMPOSANT <- structure(list(CODE_C = c("FABR**01", "FABR**01", "FABR**01", "FABR**01", "FABR**01", "FABR**01"), CODE_P = c("FABR**", "FABR**", "FABR**", "FABR**", "FABR**", "FABR**"), LIB_COMPOSANT = c("Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82"), LIB_PRODUIT = c("Abricot", "Abricot", "Abricot", "Abricot", "Abricot", "Abricot"), date = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-06-01"), PRIX = c(1.32446153846154, 1.09984615384615,  3.33157894736842, 4.70916279069767, 4.37848648648649, 3.24713953488372)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))
DONNEES_COMPOSANT
# # A tibble: 6 × 6
#   CODE_C   CODE_P LIB_COMPOSANT LIB_PRODUIT date        PRIX
#   <chr>    <chr>  <chr>         <chr>       <chr>      <dbl>
# 1 FABR**01 FABR** Abricot, 82   Abricot     2020-01-01  1.32
# 2 FABR**01 FABR** Abricot, 82   Abricot     2020-02-01  1.10
# 3 FABR**01 FABR** Abricot, 82   Abricot     2020-03-01  3.33
# 4 FABR**01 FABR** Abricot, 82   Abricot     2020-04-01  4.71
# 5 FABR**01 FABR** Abricot, 82   Abricot     2020-05-01  4.38
# 6 FABR**01 FABR** Abricot, 82   Abricot     2020-06-01  3.25

To get what you want, we need to specify your first four columns as id_cols:

pivot_wider(DONNEES_COMPOSANT, id_cols = c(CODE_C, CODE_P, LIB_COMPOSANT, LIB_PRODUIT), names_from = "date", values_from = "PRIX")
# # A tibble: 1 × 10
#   CODE_C   CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
#   <chr>    <chr>  <chr>         <chr>              <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
# 1 FABR**01 FABR** Abricot, 82   Abricot             1.32         1.10         3.33         4.71         4.38         3.25
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading