R: Split a title phrase into sub-phrases of a given maximum length

I have an R data frame where the columns have names such as the following:

"Goods excluding food purchased from stores and energy\nLast = 1.8"

"Books and reading material (excluding textbooks)\nLast = 136.1"

"Spectator entertainment (excluding video and audio subscription services)\nLast = -13.5"

There are a large number of columns. I want to insert newline characters where necessary, between words, so that these names consist of parts that are no longer than some given maximum, say MaxLen=18. And I want the last part, starting with the word "Last", to be on a separate line. In the three examples, the desired output is:

"Goods excluding\nfood purchased\nfrom stores and\nenergy\nLast = 1.8"

"Books and reading\nmaterial\n(excluding\ntextbooks)\nLast = 136.1"

"Spectator\nentertainment\n(excluding video\nand audio\nsubscription\nservices)\nLast = -13.5"

I have been trying to accomplish this with strsplit(), but without success. The parentheses and ‘=’ sign may be part of my problem. The "\nLast = " portion is the same for all names.

Any suggestions much appreciated.

Solution:

The strwrap() function can help here, though it takes a bit of work to keep the existing breaks, because strwrap() treats newline characters in its input as ordinary whitespace. Consider this option:

input <- c("Goods excluding food purchased from stores and energy\nLast = 1.8",
           "Books and reading material (excluding textbooks)\nLast = 136.1",
           "Spectator entertainment (excluding video and audio subscription services)\nLast = -13.5")

strsplit(input, "\n") |>                                  # split at the existing breaks
  lapply(function(s) unlist(sapply(s, strwrap, 18))) |>   # wrap each piece separately
  sapply(paste, collapse = "\n")                          # rejoin the pieces with newlines
# [1] "Goods excluding\nfood purchased\nfrom stores and\nenergy\nLast = 1.8"
# [2] "Books and reading\nmaterial\n(excluding\ntextbooks)\nLast = 136.1"
# [3] "Spectator\nentertainment\n(excluding video\nand audio\nsubscription\nservices)\nLast = -13.5"

Here we split at the existing breaks, wrap each piece to the target width, then paste it all back together.
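Since the question is about the column names of a data frame with many columns, the same steps could be wrapped in a small function and applied to names() in one go. The following is only a sketch: wrap_names, max_len, and the tiny df are made-up names for illustration, not part of the original answer. It passes max_len + 1 as the width because strwrap() produces lines strictly shorter than width, so a part of exactly max_len characters is still allowed.

```r
# Hypothetical helper: wrap each string so that no part is longer than
# max_len characters, keeping any line breaks that are already present.
wrap_names <- function(x, max_len = 18) {
  vapply(strsplit(x, "\n", fixed = TRUE), function(parts) {
    # strwrap() yields lines strictly shorter than `width`, hence max_len + 1
    wrapped <- unlist(lapply(parts, strwrap, width = max_len + 1))
    paste(wrapped, collapse = "\n")
  }, character(1))
}

# Made-up two-column data frame, for illustration only
df <- data.frame(a = 1:2, b = 3:4)
names(df) <- c("Goods excluding food purchased from stores and energy\nLast = 1.8",
               "Books and reading material (excluding textbooks)\nLast = 136.1")

# Rewrap all column names at once
names(df) <- wrap_names(names(df), max_len = 18)
```

Note that strwrap() only breaks between words, so a single word longer than max_len would stay intact on its own line rather than being cut.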
