I have a string like this:
"05/05/2005 ANNIVERSARY $367.62 ANNUAL DIVIDEND DECLARED UNDER THE PAIO UP ADDITIONS 20,965 2,203 23,168 | PAID UP ADDITION OPTION. $367.62 PURCHASED PAID UP ADDITIONS OF 2,203 02/15/2006 WITHDRAWAL ($77.50) VALUE OF PAID UP ADDITIONS OF 464 PAID UP ADDITIONS 23,168 (464) 22,704 APPLIED TOWARDS CHECK-O-MATIC PREMIUM DUE 03/05/2006 04/11/2006 05/05/2006 ANNIVERSARY $415.70"
I would like to build a data frame in R that holds, for every occurrence of the word ANNIVERSARY in the string, the date just before it and the dollar amount just after it. The desired output is:
Date Dividend
05/05/2005 $367.62
05/05/2006 $415.70
Thank you in advance.
I tried splitting the string with str_split but don’t know where to go from there.
>Solution :
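If we only need the dates and dollar amounts, we can use str_extract_all with regex lookarounds: a lookahead keeps the dates that are followed by ANNIVERSARY, and a lookbehind keeps the amounts that are preceded by it. (Newer versions of stringr can also return a capture group directly from str_extract.)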
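library(stringr)
library(tibble)
# Dates: keep only those followed by whitespace and ANNIVERSARY (lookahead)
dates <- str_extract_all(str1, "\\d{2}/\\d{2}/\\d{4}(?=\\s+ANNIVERSARY)")[[1]]
# Amounts: keep only those preceded by "ANNIVERSARY " with a single space (lookbehind)
amounts <- str_extract_all(str1, "(?<=ANNIVERSARY )\\$[0-9.]+")[[1]]
# Both vectors are in order of appearance, so they pair up row by row
tibble(dates, amounts)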
# A tibble: 2 × 2
  dates      amounts
  <chr>      <chr>
1 05/05/2005 $367.62
2 05/05/2006 $415.70
data
str1 <- "05/05/2005 ANNIVERSARY $367.62 ANNUAL DIVIDEND DECLARED UNDER THE PAIO UP ADDITIONS 20,965 2,203 23,168 | PAID UP ADDITION OPTION. $367.62 PURCHASED PAID UP ADDITIONS OF 2,203 02/15/2006 WITHDRAWAL ($77.50) VALUE OF PAID UP ADDITIONS OF 464 PAID UP ADDITIONS 23,168 (464) 22,704 APPLIED TOWARDS CHECK-O-MATIC PREMIUM DUE 03/05/2006 04/11/2006 05/05/2006 ANNIVERSARY $415.70"
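The two extractions above run independently, so a date and an amount could end up mismatched if one pattern matches and the other does not (for example, when two spaces follow ANNIVERSARY, or the amount contains a thousands separator). As a possible alternative, here is a minimal sketch that captures both values in a single pass with str_match_all. The shortened sample string `txt` and the names `m` and `res` are made up for illustration, and allowing commas in the amount is an assumption about how larger dividends would be printed.

```r
library(stringr)
library(tibble)

# Shortened, hypothetical sample with the same layout as str1
txt <- "05/05/2005 ANNIVERSARY $367.62 ANNUAL DIVIDEND 02/15/2006 WITHDRAWAL ($77.50) 03/05/2006 04/11/2006 05/05/2006 ANNIVERSARY $415.70"

# Group 1: date directly before ANNIVERSARY
# Group 2: dollar amount directly after it (commas allowed by assumption)
pattern <- "(\\d{2}/\\d{2}/\\d{4})\\s+ANNIVERSARY\\s+(\\$[0-9,]+(?:\\.[0-9]+)?)"

# str_match_all returns one matrix per input string:
# column 1 is the full match, the remaining columns are the capture groups
m <- str_match_all(txt, pattern)[[1]]

# Each row of m is one ANNIVERSARY entry, so date and amount stay paired
res <- tibble(Date = m[, 2], Dividend = m[, 3])
res
```

Because each row of the match matrix comes from one match, an ANNIVERSARY entry that lacks either piece is dropped as a whole instead of shifting the remaining rows. If numeric values are needed, something like as.numeric(gsub("[$,]", "", res$Dividend)) should strip the dollar sign and commas before conversion.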