Extract from a data frame column the string that matches a pattern from a vector of strings

I have a dataset in which one column holds a quote that mentions the name of a state. Below is an example:

library(tidyverse)
df <- tibble(num = c(11,12,13), quote = c("In Ohio, there are plenty of hobos","Georgia, where the peaches are peachy","Oregon, no, we did not die of dysentery"))

I want to create a column containing the state named in each quote.

Here’s what I tried:

states <- state.name
df <- df %>% mutate(state = na.omit(as.vector(str_match(quote,states)))[[1]])

This produces the following error:

Error in `mutate()`:
ℹ In argument: `state = na.omit(as.vector(str_match(quote, states)))[[1]]`.
Caused by error in `str_match()`:
! Can't recycle `string` (size 3) to match `pattern` (size 50).

Solution:

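The error occurs because str_match is vectorised over both string and pattern: it tries to pair the 3 quotes with the 50 state names element by element, and those lengths cannot be recycled. Instead, collapse the state names into a single pattern, with the names separated by | (regex alternation), and use str_extract to pull the first matching name out of each quote.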

library(dplyr)
library(stringr)

df %>% 
  mutate(state = str_extract(quote,str_c(state.name, collapse = "|")))

#    num quote                                   state  
#  <dbl> <chr>                                   <chr>  
#1    11 In Ohio, there are plenty of hobos      Ohio   
#2    12 Georgia, where the peaches are peachy   Georgia
#3    13 Oregon, no, we did not die of dysentery Oregon 

Here, str_c builds the following pattern string:

str_c(state.name, collapse = "|")
[1] "Alabama|Alaska|Arizona|Arkansas|California|Colorado|Connecticut|Delaware|Florida|Georgia|Hawaii|Idaho|Illinois|Indiana|Iowa|Kansas|Kentucky|Louisiana|Maine|Maryland|Massachusetts|Michigan|Minnesota|Mississippi|Missouri|Montana|Nebraska|Nevada|New Hampshire|New Jersey|New Mexico|New York|North Carolina|North Dakota|Ohio|Oklahoma|Oregon|Pennsylvania|Rhode Island|South Carolina|South Dakota|Tennessee|Texas|Utah|Vermont|Virginia|Washington|West Virginia|Wisconsin|Wyoming"
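
One caveat with the plain alternation is that it also matches inside longer words (for example, "Indiana" inside "Indianapolis"), and str_extract only returns the first match per quote. The following is a minimal sketch of one way to tighten this, assuming whole-word matching is what you want. The name state_pattern and the two extra test strings are hypothetical, made up for illustration and not part of the original question.

```r
library(dplyr)
library(stringr)

# Same example data as in the question
df <- tibble(
  num = c(11, 12, 13),
  quote = c("In Ohio, there are plenty of hobos",
            "Georgia, where the peaches are peachy",
            "Oregon, no, we did not die of dysentery")
)

# Assumption: only whole words should count as a state name, so the
# alternation is wrapped in word boundaries (\\b) and a non-capturing group.
state_pattern <- str_c("\\b(?:", str_c(state.name, collapse = "|"), ")\\b")

result <- df %>%
  mutate(state = str_extract(quote, state_pattern))

# Hypothetical input: "Indiana" is not a whole word here, so no match (NA)
str_extract("Indianapolis is a city", state_pattern)

# Hypothetical input: str_extract_all returns every state found in a quote
str_extract_all("From Texas to Maine", state_pattern)
```

On the data from the question this gives the same state column as above. If a quote can mention several states, str_extract_all returns a list column, which you would then need to unnest or collapse depending on what you want to do with it.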