Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R Combine two rows into one for multiple columns in a dataframe

I have a dataset that I am trying to tidy up using different approaches. For step one I want to merge every two rows in each of the columns to a single row as shown in the desired output.

How can I do this in R by the tidy way?

Sample Data

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Date = c("SB",
         "1/4/2021", 
         "HC/SB",
         "1/5/2021",
         "NC",
         "1/6/2021",
         "HC",
         "1/13/2021")

Date_Approved = c(" ",
                  "1/4/2021",
                  " ",
                  "1/8/2021",
                  " ",
                  "1/12/2021",
                  " ",
                  "1/15/2021")

SR = c(" ",
       "1A",
       " ",
       "1B",
       " ",
       "1C",
       " ",
       "1D")

Permit = c(" ",
       "AAA",
       " ",
       "BBB",
       " ",
       "CCC",
       " ",
       "DDD") 

Owner_Agent = c("Joe",
                "Joey",
                "Ross",
                "Chandler",
                "Monica",
                "Rachel",
                "Ed",
                "Edd",
                "Eddy")

Address = c("1111 W. Broward Boulevard",
            "Plantation, 33333",
            "2222 N 23 Avenue",
            "Hollywood, FL 33322",
            "3333 Taylor Street",
            "Hollywood, 33311",
            "44444 NW 19th St",
            "5555 Oak St",
            "Pembroke Pines, 33300") 

The original data looks like this:

enter image description here

Desired Output

Date                 Date_Approved  SR       Permit     Owner_Agent
 SB    1/4/2021      1/4/2021      1A        AAA        Joe, Joey
 HC/SB 1/5/2021      1/8/2021      1B        BBB        Chandler, Monica
 NC    1/6/2021      1/12/2021     1C        CCC        Rachel, Ed
 HC    1/13/2021     1/15/2021     1D        DDD        Edd, Eddy
 Address
 1111 W. Broward Boulevard Plantation, 33333
 2222 N 23 Avenue Hollywood, FL 33322
 3333 Taylor Street Hollywood, 33311
 44444 NW 19th St Pembroke Pines, 33300

I have looked up this and this, but using group_by messes up the df.

Code

library(tidyverse)

df = data.frame(Date,
                Date_Approved,
                SR,
                Permit,
                Owner_Agent,
                Address)

# Tidy up the df

df = df %>% 

>Solution :

You can try to create a row identifier, group by that id, and use summarize(across()) as below:

df %>% 
  mutate(id=rep(1:(n()/2), each=2)) %>% 
  group_by(id) %>% 
  summarize(across(Date:Address, ~trimws(paste0(.x, collapse=" "))))

Output:

# A tibble: 4 × 7
     id Date           Date_Approved SR    Permit Owner_Agent   Address                                    
  <int> <chr>          <chr>         <chr> <chr>  <chr>         <chr>                                      
1     1 SB 1/4/2021    1/4/2021      1A    AAA    Joe Joey      1111 W. Broward Boulevard Plantation, 33333
2     2 HC/SB 1/5/2021 1/8/2021      1B    BBB    Ross Chandler 2222 N 23 Avenue Hollywood, FL 33322       
3     3 NC 1/6/2021    1/12/2021     1C    CCC    Monica Rachel 3333 Taylor Street Hollywood, 33311        
4     4 HC 1/13/2021   1/15/2021     1D    DDD    Ed Eddy       44444 NW 19th St Pembroke Pines, 33300  

Input:

structure(list(Date = c("SB", "1/4/2021", "HC/SB", "1/5/2021", 
"NC", "1/6/2021", "HC", "1/13/2021"), Date_Approved = c(" ", 
"1/4/2021", " ", "1/8/2021", " ", "1/12/2021", " ", "1/15/2021"
), SR = c(" ", "1A", " ", "1B", " ", "1C", " ", "1D"), Permit = c(" ", 
"AAA", " ", "BBB", " ", "CCC", " ", "DDD"), Owner_Agent = c("Joe", 
"Joey", "Ross", "Chandler", "Monica", "Rachel", "Ed", "Eddy"), 
    Address = c("1111 W. Broward Boulevard", "Plantation, 33333", 
    "2222 N 23 Avenue", "Hollywood, FL 33322", "3333 Taylor Street", 
    "Hollywood, 33311", "44444 NW 19th St", "Pembroke Pines, 33300"
    )), class = "data.frame", row.names = c(NA, -8L))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading