collapse duplicate rows into a single row by "|"

I have a data frame which looks like:

# A tibble: 19 × 5
   Date       Country `Implementing Institution(s)`                                                    `Policy Tool(s)`                                        Magnitude
   <date>     <chr>   <chr>                                                                            <chr>                                                   <chr>    
 1 2020-05-14 Spain   Ministry of Economic Affairs and Digital Transformation,European Investment Bank Lending                                                 1.5      
 2 2020-05-28 Spain   European Investment Bank                                                         Lending                                                 250      
 3 2020-06-16 Spain   European Investment Bank                                                         Credit guarantees                                       576      
 4 2020-07-01 Spain   European Investment Bank                                                         Lending                                                 900      
 5 2020-07-01 Spain   European Investment Bank                                                         Credit guarantees                                       523      
 6 2020-09-01 Spain   Ministry of Labor and Social Economy                                             Grants                                                  16.8     
 7 2020-05-13 Spain   Development Bank of Latin America                                                Grants                                                  400      
 8 2020-10-13 Spain   Council of Ministers,Government                                                  Grants                                                  NA       
 9 2020-11-17 Spain   European Commission                                                              Lending                                                 4  

I would like to collapse the rows where there are "duplicates" in the "Date" column. That is, row 4 and 5 with the date = 2020-07-01 should be collapsed down into the following single row:

From this:

4 2020-07-01 Spain   European Investment Bank  Lending            900      
 5 2020-07-01 Spain   European Investment Bank Credit guarantees  523      

To this:

2020-07-01 Spain   European Investment Bank   Lending | Credit guarantees  900 | 523 

Data:

df <- structure(list(Date = structure(c(18396, 18410, 18429, 18444, 
18444, 18506, 18395, 18548, 18583, 18369, 18369, 18375, 18376, 
18474, 18396, 18403, 18619, 18705, 18717), class = "Date"), Country = c("Spain", 
"Spain", "Spain", "Spain", "Spain", "Spain", "Spain", "Spain", 
"Spain", "Spain", "Spain", "Spain", "Spain", "Spain", "Spain", 
"Spain", "Spain", "Spain", "Spain"), `Implementing Institution(s)` = c("Ministry of Economic Affairs and Digital Transformation,European Investment Bank", 
"European Investment Bank", "European Investment Bank", "European Investment Bank", 
"European Investment Bank", "Ministry of Labor and Social Economy", 
"Development Bank of Latin America", "Council of Ministers,Government", 
"European Commission", "Council of Europe Development Bank", 
"Council of Europe Development Bank", "Council of Europe Development Bank", 
"European Commission", "European Commission", "European Investment Bank", 
"European Investment Bank", "European Investment Bank", "Council of Europe Development Bank", 
"European Investment Bank"), `Policy Tool(s)` = c("Lending", 
"Lending", "Credit guarantees", "Lending", "Credit guarantees", 
"Grants", "Grants", "Grants", "Lending", "Lending", "Lending", 
"Lending", "Credit guarantees,Lending,Capital injections,Tax relief", 
"Capital injections", "Lending", "Lending", "Lending", "Lending", 
"Lending"), Magnitude = c("1.5", "250", "576", "900", "523", 
"16.8", "400", NA, "4", "200", "300", "334", NA, "10", "1.5", 
"600", "270", "50", "300")), row.names = c(NA, -19L), class = c("tbl_df", 
"tbl", "data.frame"))

>Solution :

You can use the following code:

library(dplyr)
df %>%
  group_by(Date) %>% 
  summarise(across(everything(), ~ paste(unique(.x[!is.na(.x)]), collapse = "|"))) %>% 
  ungroup()

Output:

# A tibble: 16 × 5
   Date       Country `Implementing Institution(s)`                         `Policy Tool(s)` Magnitude
   <date>     <chr>   <chr>                                                 <chr>            <chr>    
 1 2020-04-17 Spain   Council of Europe Development Bank                    Lending          "200|300"
 2 2020-04-23 Spain   Council of Europe Development Bank                    Lending          "334"    
 3 2020-04-24 Spain   European Commission                                   Credit guarante… ""       
 4 2020-05-13 Spain   Development Bank of Latin America                     Grants           "400"    
 5 2020-05-14 Spain   Ministry of Economic Affairs and Digital Transformat… Lending          "1.5"    
 6 2020-05-21 Spain   European Investment Bank                              Lending          "600"    
 7 2020-05-28 Spain   European Investment Bank                              Lending          "250"    
 8 2020-06-16 Spain   European Investment Bank                              Credit guarante… "576"    
 9 2020-07-01 Spain   European Investment Bank                              Lending|Credit … "900|523"
10 2020-07-31 Spain   European Commission                                   Capital injecti… "10"     
11 2020-09-01 Spain   Ministry of Labor and Social Economy                  Grants           "16.8"   
12 2020-10-13 Spain   Council of Ministers,Government                       Grants           ""       
13 2020-11-17 Spain   European Commission                                   Lending          "4"      
14 2020-12-23 Spain   European Investment Bank                              Lending          "270"    
15 2021-03-19 Spain   Council of Europe Development Bank                    Lending          "50"     
16 2021-03-31 Spain   European Investment Bank                              Lending          "300" 

Leave a Reply