Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Standardizing the character strings in multiple rows (R or Unix)

I would like to standardize all those _xxxxxx character strings to the xxxxxxH format in V1 column.

V1          V2      V3
122223H     20      Test kits
122224H     23      Test kits
122225H     42      Test kits
122227H     31      Test kits
_122228     23      Test kits
_122229     57      Test kits
_122231     21      Test kits
122232H     33      Test kits
122234H     22      Test kits
.......     ..      .... ....
.......     ..      .... ....
.......     ..      .... ....
122250H     33      Test kits

I tried to solve it with gsub function in R but couldn’t make the exact format that I need. Any kind of suggestions, please!!! Unix based commands are also helpful.

df <- gsub("_","H",c(file$V1))

Outputs;

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

"H1222228" "H1222229" "H1222231"   

Desired outputs;

V1          V2      V3
122223H     20      Test kits
122224H     23      Test kits
122225H     42      Test kits
122227H     31      Test kits
122228H     23      Test kits
122229H     57      Test kits
122231H     21      Test kits
122232H     33      Test kits
122234H     22      Test kits
.......     ..      .... ....
.......     ..      .... ....
.......     ..      .... ....
122250H     33      Test kits

>Solution :

Try the following, though more elegant solutions may exist:

df <- data.frame(v1 = c("122223H","122224H","122225H","122227H","_122228","_122229"),
           v2 = c(21,23,42,31,23,57),
           v3 = rep("Test Kits", times = 6))


df$newstring <- gsub("_","",c(df$v1))
df$newstring <- ifelse(grepl("H", df$newstring, fixed = TRUE), df$newstring, paste0(df$newstring,"H"))


# > df
# v1 v2        v3 newstring
# 1 122223H 21 Test Kits   122223H
# 2 122224H 23 Test Kits   122224H
# 3 122225H 42 Test Kits   122225H
# 4 122227H 31 Test Kits   122227H
# 5 _122228 23 Test Kits   122228H
# 6 _122229 57 Test Kits   122229H
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading