dplyr: swap end date with start date if end date is less than start date


I have a data frame with start and end dates, and I am calculating the difference in time between them using the difftime function. However, some of my start dates are greater than my end dates, which results in a negative time difference. I need to swap the start date for the end date and vice versa when this happens. How can I do this?

Here is an example data frame:

# Example data: row 2 has a start date later than its end date
df1 <- data.frame(
  date_start = as.Date(c('2004-11-09', '2020-01-01', '1992-09-01')),
  date_end   = as.Date(c('2005-11-09', '2010-12-31', '2006-10-31'))
)
# Row 2 yields a negative difference because its dates are reversed
df1$diff <- difftime(df1$date_end, df1$date_start)
df1

The correct data frame should look like this (with the start and end dates swapped in the second row):

  date_start   date_end      diff
1 2004-11-09 2005-11-09  365 days
2 2010-12-31 2020-01-01 3288 days
3 1992-09-01 2006-10-31 5173 days

Solution:

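# transform() evaluates every argument against the incoming data,
# so pmax() still sees the original date_start
df1 <- df1 |>
  transform(date_start = pmin(date_start, date_end),
            date_end   = pmax(date_start, date_end)) |>
  transform(diff = difftime(date_end, date_start))  # recompute from the swapped dates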

(base::transform is similar to dplyr::mutate, with one important difference: transform evaluates every term against the incoming data, whereas mutate evaluates each term against "the data as it exists after modifications to that point." The same code would therefore not work with mutate: date_start would already hold the earlier date by the time pmax runs, so date_end would never receive the later one.)
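Since the question asks about dplyr, here is a minimal sketch of how the same swap could be written with mutate. It assumes dplyr is installed and R is version 4.1 or later (for the |> pipe). The names .start, naive, and fixed are illustrative only and do not come from the original answer. The idea is to hold the earlier date in a temporary column so that pmax still sees the original date_start.

```r
library(dplyr)

# Same example data as in the question
df1 <- data.frame(
  date_start = as.Date(c("2004-11-09", "2020-01-01", "1992-09-01")),
  date_end   = as.Date(c("2005-11-09", "2010-12-31", "2006-10-31"))
)

# Naive version: mutate() works sequentially, so date_start is already
# overwritten when pmax() runs and the later date in row 2 is lost
naive <- df1 |>
  mutate(date_start = pmin(date_start, date_end),
         date_end   = pmax(date_start, date_end))

# Working version: keep the earlier date in a temporary column (.start),
# compute date_end from the untouched originals, then assign and drop .start
fixed <- df1 |>
  mutate(.start     = pmin(date_start, date_end),
         date_end   = pmax(date_start, date_end),
         date_start = .start,
         .start     = NULL,
         diff       = difftime(date_end, date_start))

fixed
```

In naive, row 2 ends up with 2010-12-31 in both columns, which is the failure the note above describes. In fixed, diff is computed last, so it is based on the swapped dates and is never negative.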
