How to append only two columns of a data frame in R

Let's say I have the following data frame:

dt <- data.frame(id= c(1),
                 parameter= c("a","b","c"),
                 start_day = c(1,8,4),
                 end_day = c(16,NA,30))

I need to combine the start_day and end_day columns into a single column (let's call it day) while preserving all the other columns. I also need another column that indicates whether each row holds a start_day or an end_day value. To clarify, I am looking to create the following data frame:

[image: the desired data frame]

I am creating the above data frame using the following code:

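library(dplyr)  # provides %>%, rename() and bind_rows()

# Rows for start_day: drop end_day, rename the value column, flag with start = 1
dt1 <- subset(dt, select = -c(end_day))
dt1 <- dt1 %>% rename(day = start_day)
dt1$start <- 1

# Rows for end_day: drop start_day, rename the value column, flag with end = 1
dt2 <- subset(dt, select = -c(start_day))
dt2 <- dt2 %>% rename(day = end_day)
dt2$end <- 1

# Stack both pieces (missing flag columns are filled with NA) and sort
dt <- bind_rows(dt1, dt2)
dt <- dt[order(dt$id, dt$parameter),]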

Although my code works, I am not happy with my solution. I am certain there is a better and cleaner way to do this. I would appreciate any input on better alternatives for tackling this problem.

Solution:

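# Requires R >= 4.1 (native pipe |>) and the tidyr and dplyr packages.
# names_to defaults to "name", which is the column tested in mutate() below.
(tidyr::pivot_longer(dt, cols = c(start_day, end_day), values_to = "day")
    |> dplyr::mutate(start = ifelse(name == "start_day", 1, NA),
                     end = ifelse(name == "end_day", 1, NA))
)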

Result:

# A tibble: 6 × 6                                                                                            
      id parameter name        day start   end                                                                
   <dbl> <chr>     <chr>     <dbl> <dbl> <dbl>                                                                
 1     1 a         start_day     1     1    NA                                                                
 2     1 a         end_day      16    NA     1                                                                
 3     1 b         start_day     8     1    NA                                                                
 4     1 b         end_day      NA    NA     1                                                                
 5     1 c         start_day     4     1    NA                                                                
 6     1 c         end_day      30    NA     1   

You could get rid of the name column, but maybe it would be more useful than your new start/end columns?
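
If a single indicator column is enough, one possible variant is to keep the column that pivot_longer() creates and just tidy it up. This is only a sketch: the column name type and the sub() call that strips the "_day" suffix are my own choices for illustration, not something the question asked for.

```r
library(tidyr)
library(dplyr)

# Same example data as in the question
dt <- data.frame(id = 1,
                 parameter = c("a", "b", "c"),
                 start_day = c(1, 8, 4),
                 end_day = c(16, NA, 30))

# Stack start_day and end_day into "day"; the former column names go into
# "type" (a hypothetical name), shortened to "start" / "end"
long <- dt |>
  pivot_longer(cols = c(start_day, end_day),
               names_to = "type",
               values_to = "day") |>
  mutate(type = sub("_day$", "", type))

print(long)
```

This keeps one row per original value, with all other columns preserved, and avoids carrying two mostly-NA flag columns. If you still want the separate start/end columns, the mutate() step from the solution above can be applied to type instead of name.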
