Split string keeping spaces in R

I would like to build a table from raw text using readr::read_fwf. Its col_positions argument determines the column widths, which in my case can differ from one input to the next.
The table always has 4 columns, and their widths are defined by the first 4 words of a header string like the one below (every word except the last):
category variable description value sth

> text_for_column_width = "category    variable   description      value      sth"
> nchar("category    ")
[1] 12
> nchar("variable   ")
[1] 11
> nchar("description      ")
[1] 17
> nchar("value      ")
[1] 11

I want to extract the first 4 words together with their trailing spaces, so that, for example, category counts as 12 characters (8 letters + 4 spaces), and then build a vector with the number of characters for each of the four names: c(12, 11, 17, 11). I tried strsplit with a space as the split argument and then counting the empty strings it produces, but I believe there is a faster way using a proper regular expression.

Solution:

A possible solution, using stringr:

library(stringr)  # also provides the %>% pipe used below

text_for_column_width = "category    variable   description      value      sth"

strings <- text_for_column_width %>% 
  str_remove("sth$") %>%              # drop the trailing fifth word
  str_split("(?<=\\s)(?=\\S)") %>%    # split wherever whitespace is followed by a non-space character
  unlist()

strings

#> [1] "category    "      "variable   "       "description      "
#> [4] "value      "

strings %>% str_count

#> [1] 12 11 17 11
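
For comparison, here is a minimal base R sketch of the same idea that does not hard-code the last word. It assumes the header always ends with a word that has no trailing whitespace; the names fields, n_cols and widths are only illustrative.

```r
# Header line whose spacing defines the column widths (taken from the question).
text_for_column_width <- "category    variable   description      value      sth"

# Match every word together with the run of whitespace that follows it.
# The pattern requires trailing whitespace, so the final word ("sth") is not matched.
fields <- regmatches(
  text_for_column_width,
  gregexpr("\\S+\\s+", text_for_column_width, perl = TRUE)
)[[1]]

# Assumption: the table always has four columns.
n_cols <- 4

# Number of characters (letters plus trailing spaces) in each of the first four fields.
widths <- nchar(head(fields, n_cols))
widths
#> [1] 12 11 17 11
```

The resulting vector should be usable as the widths argument of readr::fwf_widths(), whose result can then be passed to read_fwf() as col_positions.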