Create a column from another column based on keywords

Based on the data below, how can I add a third column, Type? The type of each hospital is determined by certain words in the hospital name.

    Word         Type
    Government   Government
    Govt         Government
    St Jude      Religious
    Catholic     Religious
    District     District
    Community    Community
    Divine Mercy Religious
    St. Luke     Religious
    St. Theresa  Religious
    Islamic      Religious
    Babtist      Religious

Data:

df = structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), 
    Hospital = c("A Government Hospital", "Government B Hospital", 
    "C Govt Hospital", "D St Jude Hospital", "D Catholic Hospital", 
    "Catholic E Hospital", "F District Hospital", "G Community Hospital", 
    "H Divine Mercy Hospital", "I St. Luke Hospital", "J St. Theresa Hospital", 
    "Babtist Hospital")), class = "data.frame", row.names = c(NA, 
-12L))

# Desired df
df_desired = structure(list(Hospital = c("A Government Hospital", "Government B Hospital", 
    "C Govt Hospital", "D St Jude Hospital", "D Catholic Hospital", 
    "Catholic E Hospital", "F District Hospital", "G Community Hospital", 
    "H Divine Mercy Hospital", "I St. Luke Hospital", "J St. Theresa Hospital", 
    "Babtist Hospital"), Type = c("Government", "Government", 
    "Government", "Religious", "Religious", "Religious", "District", 
    "Community", "Religious", "Religious", "Religious", "Religious"
    )), class = "data.frame", row.names = c(NA, -12L))

>Solution :

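If we have a key/value dataset, we can use regex_left_join from fuzzyjoin. It joins each hospital name to the rows of the key table whose Word pattern matches the name.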

library(fuzzyjoin)
library(dplyr)
regex_left_join(df, keydat, by = c("Hospital" = "Word")) %>%   
  select(-Word)

-output

   id                Hospital       Type
1   1   A Government Hospital Government
2   2   Government B Hospital Government
3   3         C Govt Hospital Government
4   4      D St Jude Hospital  Religious
5   5     D Catholic Hospital  Religious
6   6     Catholic E Hospital  Religious
7   7     F District Hospital   District
8   8    G Community Hospital  Community
9   9 H Divine Mercy Hospital  Religious
10 10     I St. Luke Hospital  Religious
11 11  J St. Theresa Hospital  Religious
12 12        Babtist Hospital  Religious

data

The first pattern, Gover(nt?)?ment, matches Government as well as the misspellings Goverment and Governtment.

keydat <- structure(list(Word = c("Gover(nt?)?ment", "Govt", "St Jude", 
"Catholic", "District", "Community", "Divine Mercy", "St. Luke", 
"St. Theresa", "Islamic", "Babtist"), Type = c("Government", 
"Government", "Religious", "Religious", "District", "Community", 
"Religious", "Religious", "Religious", "Religious", "Religious"
)), row.names = c(NA, -11L), class = "data.frame")
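
For anyone who would rather not install fuzzyjoin, the same lookup can be sketched in base R: test every pattern against each hospital name with grepl and take the Type of the first pattern that matches. This is only a rough sketch, not what regex_left_join does internally. lookup_type is a made-up helper name, and the first pattern is written here as Gover(nt?)?ment on the assumption that the correct spelling and the misspellings Goverment and Governtment should all count as Government.

```r
# Example data, copied from the question and the key table above.
df <- data.frame(
  id = 1:12,
  Hospital = c("A Government Hospital", "Government B Hospital",
               "C Govt Hospital", "D St Jude Hospital", "D Catholic Hospital",
               "Catholic E Hospital", "F District Hospital",
               "G Community Hospital", "H Divine Mercy Hospital",
               "I St. Luke Hospital", "J St. Theresa Hospital",
               "Babtist Hospital"),
  stringsAsFactors = FALSE
)

keydat <- data.frame(
  # Assumption: this pattern is meant to cover "Government" and its misspellings.
  Word = c("Gover(nt?)?ment", "Govt", "St Jude", "Catholic", "District",
           "Community", "Divine Mercy", "St. Luke", "St. Theresa",
           "Islamic", "Babtist"),
  Type = c("Government", "Government", "Religious", "Religious", "District",
           "Community", "Religious", "Religious", "Religious", "Religious",
           "Religious"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): return the Type of the first
# pattern that matches `name`, or NA if no pattern matches.
lookup_type <- function(name, patterns, types) {
  hit <- which(vapply(patterns, grepl, logical(1), x = name))
  if (length(hit) == 0) NA_character_ else types[hit[1]]
}

# Apply the helper to every hospital name.
df$Type <- vapply(df$Hospital, lookup_type, character(1),
                  patterns = keydat$Word, types = keydat$Type,
                  USE.NAMES = FALSE)

df
```

Unlike a join, this keeps exactly one row per hospital even if several patterns match, and names with no match get NA. The keywords are treated as regular expressions, so the dot in "St. Luke" matches any character; write "St\\. Luke" if it should match only a literal dot.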