How to replace specific values in a column with other values?

I have a data frame and want to add a new column to it based on another column and then replace its values.

For example, column ID.old is what I have:

df1 <- structure(list(ID.old=c(1,1,1,  2,2,  3,3,3,3,  4,4,  5,5,5,5,5,  6,6,6,  7,7,7,7,  8,8,  9, 10,10,10, 11,11,  12,12,12, 13,13,  14,14,14,14, 15,15,  16, 17,17, 18, 19,19,19, 20,20,20)),
                 class = "data.frame", row.names = c(NA,-52L))

And column ID.new is what I need:

df2 <- structure(list(ID.old=c(1,1,1,  2,2,  3,3,3,3,  4,4,  5,5,5,5,5,  6,6,6,  7,7,7,7,  8,8,  9, 10,10,10, 11,11,  12,12,12, 13,13,  14,14,14,14, 15,15,  16, 17,17, 18, 19,19,19, 20,20,20),
                      ID.new=c('a1','a1','a1', 'a2','a2', 'a3','a3','a3','a3', 'a4','a4', 'a5','a5','a5','a5','a5', 'a1','a1','a1', 'a2','a2','a2','a2', 'a3','a3', 'a4', 'a5','a5','a5', 'a1','a1', 'a2','a2','a2', 'a3','a3', 'a4','a4','a4','a4', 'a5','a5', 'a1', 'a2','a2', 'a3', 'a4','a4','a4', 'a5','a5','a5')),
                 class = "data.frame", row.names = c(NA,-52L))

I thought I could use str_replace_all from stringr, but it produces something different:

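library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_replace_all()

# Start ID.new as a copy of ID.old
df1 <- df1 %>%
  mutate(ID.new = ID.old)

# Lookup dictionary: names are the old IDs, values are the new labels
replace <- c("1"="a1", "2"="a2", "3"="a3", "4"="a4", "5"="a5",
             "6"="a1", "7"="a2", "8"="a3", "9"="a4", "10"="a5",
             "11"="a1", "12"="a2", "13"="a3", "14"="a4", "15"="a5",
             "16"="a1", "17"="a2", "18"="a3", "19"="a4", "20"="a5")

df1$ID.new <- str_replace_all(df1$ID.new, replace)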

In my original data frame I have many rows. Specifically, wherever the value is 1, 6, 11, or 16, I need "a1"; wherever it is 2, 7, 12, or 17, I need "a2"; and so on.

How can I get a column like ID.new in df2?
Thanks

Solution:

stringr::str_replace_all is regex-based. With your 'replace' dictionary, it replaces every 1 it encounters with "a1", so the number 11 becomes "a1a1", because it contains two consecutive 1s. Since you have already built the dictionary, you only need to add the start (^) and end ($) regex anchors to each pattern, as suggested below:

  1. Add this line of code right after creating your 'replace' dictionary:

names(replace) <- paste0("^", names(replace), "$")

  2. The replacement is now correct when you run df1$ID.new <- str_replace_all(df1$ID.new, replace) again.
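
To see the difference side by side, here is a minimal sketch on a small sample of IDs. The names ids, unanchored, anchored_dict, anchored, and lookup are illustrative only and not part of the original question. The dictionary is rebuilt programmatically here, on the assumption that it should hold the same 20 entries as the hand-written one. The last line is an alternative rather than the method described above: because the dictionary is an exact lookup, plain named-vector indexing in base R gives the same result without any regex.

```r
library(stringr)

# Same mapping as the hand-written 'replace' dictionary:
# 1, 6, 11, 16 -> "a1"; 2, 7, 12, 17 -> "a2"; and so on.
replace <- setNames(paste0("a", (0:19) %% 5 + 1), as.character(1:20))

# Small illustrative sample, converted to character explicitly
ids <- as.character(c(1, 6, 11, 12, 20))

# Unanchored patterns: "1" also matches inside "11" and "12",
# so multi-digit IDs are mangled (e.g. "11" becomes "a1a1").
unanchored <- str_replace_all(ids, replace)

# Anchored patterns: "^11$" only matches the whole string "11".
anchored_dict <- setNames(unname(replace), paste0("^", names(replace), "$"))
anchored <- str_replace_all(ids, anchored_dict)

# Alternative (base R, no regex): index the named vector by the ID.
lookup <- unname(replace[ids])

print(data.frame(ids, unanchored, anchored, lookup))
```

On the real data, the same idea would be df1$ID.new <- str_replace_all(as.character(df1$ID.old), anchored_dict), assuming every value of ID.old appears in the dictionary.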