I have a data frame which contains an unknown number of columns. The data frame is generated by a previous step that splits a string on ‘&’, so the number of columns depends on the number of & characters in the string. Regardless of the number of columns, I need to remove the first two characters of each string if it has a dash in the 5th position. Whether the original columns get overwritten or the results are saved into new columns does not matter to me.
My data looks like this:
t3 <- c("2003-2342343","23-23490328","2024-23409")
t4 <- c("13-12","2013-23490","24-23409")
d <- data.frame(t3,t4)
I’m expecting the result to look like this (the 1st and 3rd elements of t3 and the 2nd element of t4 should change):
t3 <- c("03-2342343","23-23490328","24-23409")
t4 <- c("13-12","13-23490","24-23409")
d <- data.frame(t3,t4)
I’m using a loop to check the columns.
for(i in length(names(d))) {
d[,i] <- if_else((which(strsplit(d[,i], "")[[1]]=="-")) == 5,sub('..', '', d[,i]),d[,i])
}
This is the error message:
Error in `if_else()`:
! `true` must have size 1, not size 3.
Run `rlang::last_trace()` to see where the error occurred.
Any ideas on what might be happening here?
I’m using R. Thanks for your help.
>Solution :
Try also this:
library(tidyverse)
d %>%
  mutate(across(everything(), ~ str_remove(., "^\\d{2}(?=\\d{2}-)")))
str_remove() takes two arguments: the string and the pattern to remove. Here the pattern uses a positive look-ahead, (?=\\d{2}-), so the two digits at the start of the string, ^\\d{2}, are removed only if they are followed by two more digits and a -. Note that this assumes the first four characters are digits, as in the sample data.
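As for the error itself: the condition in the loop looks only at the first string of the column (because of the [[1]]), so it has length 1, while the `true` and `false` arguments have length 3, and if_else() refuses to recycle them. In addition, for(i in length(names(d))) visits only the last column; seq_along(d) visits all of them. A minimal base R sketch of the loop with both points addressed might look like this (the helper name drop_first_two is made up for illustration, and the check is on the 5th character rather than on digits):

```r
# Sample data from the question
t3 <- c("2003-2342343", "23-23490328", "2024-23409")
t4 <- c("13-12", "2013-23490", "24-23409")
d <- data.frame(t3, t4)

# Hypothetical helper: drop the first two characters of every string
# whose 5th character is a dash; leave the other strings unchanged.
drop_first_two <- function(x) {
  x <- as.character(x)  # also covers factor columns
  ifelse(substr(x, 5, 5) == "-", substring(x, 3), x)
}

# seq_along(d) iterates over every column, however many there are
for (i in seq_along(d)) {
  d[[i]] <- drop_first_two(d[[i]])
}

d
```

Because substr() and ifelse() are vectorised, the test is applied to every element of the column instead of just the first one.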