Say I have the following dataset, and I want to change the values in the range 20010001-20010010 to 2001-2010.
How can I do this for the whole range at once?
Sample data (df):
df <- structure(list(x = c(20010001, 20010001, 20010002, 20010002,
20010003, 20010003, 20010004, 20010004, 20010005, 20010005, 20010006,
20010006, 20010007, 20010007, 20010008, 20010008, 20010009, 20010009,
200100010, 200100010, 20, 2, 19, 18, 17, 16, 15, 14965, 14964
), y = c("2001", "ORIG", "2001", "ORIG", "2001", "ORIG", "2001",
"ORIG", "2001", "ORIG", "2001", "ORIG", "2001", "ORIG", "2001",
"ORIG", "2001", "ORIG", "2001", "ORIG", "2020", "2020", "2020",
"2020", "2020", "2020", "2020", "2022", "2022")), class = "data.frame", row.names = c(NA, -29L))
Code:
library(tidyverse)
# To change a single value at a time
df[1, "x"] <- 2010
# Now, how can I do this for a range of values without having to do it one by one?
>Solution :
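Another possible solution, using a regular expression with stringr's str_replace.
The pattern matches 2001 followed by one or more zeros, but only when exactly two digits remain before the end of the string (the lookahead (?=\\d{2}$)). The match is replaced with 20, so 20010001 becomes 2001 and 200100010 becomes 2010. Values that do not start with 2001 are left unchanged. Note that the new column z is a character vector, because str_replace returns strings.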
library(tidyverse)
df %>%
mutate(z = str_replace(x, "2001[0]+(?=\\d{2}$)", "20"))
#> x y z
#> 1 20010001 2001 2001
#> 2 20010001 ORIG 2001
#> 3 20010002 2001 2002
#> 4 20010002 ORIG 2002
#> 5 20010003 2001 2003
#> 6 20010003 ORIG 2003
#> 7 20010004 2001 2004
#> 8 20010004 ORIG 2004
#> 9 20010005 2001 2005
#> 10 20010005 ORIG 2005
#> 11 20010006 2001 2006
#> 12 20010006 ORIG 2006
#> 13 20010007 2001 2007
#> 14 20010007 ORIG 2007
#> 15 20010008 2001 2008
#> 16 20010008 ORIG 2008
#> 17 20010009 2001 2009
#> 18 20010009 ORIG 2009
#> 19 200100010 2001 2010
#> 20 200100010 ORIG 2010
#> 21 20 2020 20
#> 22 2 2020 2
#> 23 19 2020 19
#> 24 18 2020 18
#> 25 17 2020 17
#> 26 16 2020 16
#> 27 15 2020 15
#> 28 14965 2022 14965
#> 29 14964 2022 14964
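
If a numeric column is needed rather than strings, the same pattern can be applied with base R's sub() and the result converted back with as.numeric(). This is a minimal sketch, not part of the original answer: the vector x below is a shortened, illustrative subset of the values in the sample data, and z_num is a name introduced here only for the example.

```r
# Shortened, illustrative subset of the x column from the sample data
x <- c(20010001, 20010002, 200100010, 20, 14965)

# perl = TRUE is required because the pattern uses a lookahead (?=...)
# as.character() makes the numeric-to-string conversion explicit
z <- sub("2001[0]+(?=\\d{2}$)", "20", as.character(x), perl = TRUE)

# Convert the strings back to numbers if a numeric column is wanted
z_num <- as.numeric(z)
```

Inside a pipeline, the same conversion could be done by wrapping the str_replace call in as.numeric() within mutate.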