Extracting strings by Position in R, preferably the tidyverse

I have a dataset as follows:

library(tidyverse)
My_data <- tibble(ref = 1:3, codes = c(12204, 35478, 67456))

I want to separate the codes column as follows.

The first digit of the codes column forms a new variable clouds.

The second and third digits of the codes column form a new variable wind_direction.

The last two digits of the codes column form a new variable wind_speed.

NB: I know that str_match and str_match_all can do this. The problem is that they return a matrix. I want a solution that will extend the tibble to include the three additional variables.

Thank you.

Solution:

You can use tidyr::extract() with a regular expression that has one capture group per new column. The r"{...}" raw-string syntax requires R 4.0 or later.

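My_data %>%
  # convert the numeric codes to strings so they can be matched by a regex
  mutate(codes = as.character(codes)) %>%
  # capture groups: leading digit(s), next two digits, last two digits
  extract(codes, c("clouds","wind_direction","wind_speed"), r"{(\d+)(\d{2})(\d{2})}")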

#     ref clouds wind_direction wind_speed
#   <int> <chr>  <chr>          <chr>     
# 1     1 1      22             04        
# 2     2 3      54             78        
# 3     3 6      74             56     
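
Since the question asks about extracting by position, the same split can also be sketched without a regular expression, using stringr::str_sub() inside mutate(). This is an alternative to the answer above, not part of it. It assumes every code has exactly five digits, and the name `result` is only illustrative.

```r
library(tibble)
library(dplyr)
library(stringr)

My_data <- tibble(ref = 1:3, codes = c(12204, 35478, 67456))

result <- My_data %>%
  mutate(
    # work on the character form of the code (assumed to be five digits)
    codes = as.character(codes),
    # character 1
    clouds = str_sub(codes, 1, 1),
    # characters 2 to 3
    wind_direction = str_sub(codes, 2, 3),
    # last two characters, counted from the end
    wind_speed = str_sub(codes, -2, -1)
  ) %>%
  # drop the original column, as extract() does by default
  select(-codes)

result
```

The new columns are character vectors, as in the extract() output, so a leading zero such as "04" is preserved. Wrap a column in as.integer() if numeric values are needed.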