Fixing foreign/special characters

I have a list of names with special characters and I want to convert them back to the actual names. For example, I want to get from:

Bulcs&#250; R&#233;v&#233;sz

to the actual name of

Bulcsu Revesz

I have a few names like this, and I'm not fussy about whether the accents stay in the name or not.

>Solution :

You can use xml2 to recover the name from its HTML entity codes by parsing the string as HTML and extracting the text:

# input string 
input_str <- "Bulcs&#250; R&#233;v&#233;sz"

# convert
xml2::xml_text(xml2::read_html(charToRaw(input_str)))
# [1] "Bulcsú Révész"

# If there are multiple names to be converted 
input_str_vec <- c("Bulcs&#250; R&#233;v&#233;sz", "M&eacute;lissa", "Fran&ccedil;ois")

# sapply over the vector of encoded names
sapply(input_str_vec, \(str) {
  # parse the string as HTML and extract the decoded text
  xml2::xml_text(xml2::read_html(charToRaw(str)))
})

# Bulcs&#250; R&#233;v&#233;sz               M&eacute;lissa               Fran&ccedil;ois 
#              "Bulcsú Révész"                    "Mélissa"                   "François"      
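The code above restores the accented names. If you would rather have plain ASCII such as "Bulcsu Revesz", one possible follow-up step is to transliterate the decoded strings. This is a minimal sketch that assumes the stringi package is installed; `strip_accents` is a hypothetical helper name, not part of the original answer.

```r
# Assumption: the stringi package is available.
# strip_accents is a hypothetical helper name used only for this illustration.
strip_accents <- function(x) {
  # "Latin-ASCII" is an ICU transliterator that maps accented Latin
  # letters to their closest plain ASCII equivalents
  stringi::stri_trans_general(x, "Latin-ASCII")
}

# Already-decoded names, written with \u escapes so the example
# does not depend on the encoding of the source file
decoded <- c("Bulcs\u00fa R\u00e9v\u00e9sz", "M\u00e9lissa", "Fran\u00e7ois")

strip_accents(decoded)
# [1] "Bulcsu Revesz" "Melissa"       "Francois"
```

Base R's `iconv(x, to = "ASCII//TRANSLIT")` is an alternative without extra packages, but its results vary by platform, so check the output on your system.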
