Detect strings that have characters after a specific character

I am scraping datasets, but some files are mislabeled and are throwing off the dependent code. What I am trying to do now is filter for relevancy before passing the strings on for further analysis.

All the data comes from the USDA. Here are some sample strings:

2022_ADMLivestockLrp_Daily_20220617.zip.faerkdb3.jpj 
2022_ADMLivestockLrp_Daily_20220618.zip

What I want is to detect which strings DO NOT have characters AFTER the ".zip". I have been trying to use grepl and stringr with a ".zip*" wildcard but cannot figure it out. I am not trying to delete the characters, just to detect whether they exist or not. Any help is appreciated.

Here is what I have tried

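  library(rvest)  # provides read_html(), html_nodes(), html_attr()
  url <- 'https://ftp.rma.usda.gov/pub/references/adm_livestock/2022/'
  href <- read_html(url)
  href_names <- as.list(html_attr(html_nodes(href, "a"), "href"))
  # Attempted filter: ".zip*" also matches names with characters after ".zip"
  href_zip <- href_names[grepl(".zip*", href_names)]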

Solution:

grep("[.]zip$", href_names, value = TRUE)

[1] "/pub/references/adm_livestock/2022/2022_A00831_ADMDrpDraw_Quarterly_20210701.zip"     
[2] "/pub/references/adm_livestock/2022/2022_A00831_ADMDrpDraw_Quarterly_20210723.zip"     
[3] "/pub/references/adm_livestock/2022/2022_A00831_ADMDrpDraw_Quarterly_20211021.zip"     
[4] "/pub/references/adm_livestock/2022/2022_A00831_ADMDrpDraw_Quarterly_20220125.zip"     
[5] "/pub/references/adm_livestock/2022/2022_A00831_ADMDrpDraw_Quarterly_20220421.zip"     
[6] "/pub/references/adm_livestock/2022/2022_A00832_ADMDrpMilkYield_Quarterly_20210630.zip"
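
To see why the anchored pattern works where ".zip*" does not, here is a minimal sketch using the two sample strings from the question. The variable names (files, loose, clean, clean_fixed) are made up for illustration. In a regular expression, "." matches any character and "*" means "zero or more of the preceding item", so ".zip*" matches anywhere in the string; "[.]zip$" requires a literal dot, then "zip", then the end of the string.

```r
# Sample file names taken from the question
files <- c(
  "2022_ADMLivestockLrp_Daily_20220617.zip.faerkdb3.jpj",
  "2022_ADMLivestockLrp_Daily_20220618.zip"
)

# ".zip*": any character, then "zi", then zero or more "p".
# Nothing anchors it to the end, so both names match.
loose <- grepl(".zip*", files)

# "[.]zip$": a literal dot, "zip", then end of string.
# Only names with nothing after ".zip" match.
clean <- grepl("[.]zip$", files)

# Regex-free alternative in base R (assumes a character vector, R >= 3.3.0)
clean_fixed <- endsWith(files, ".zip")

print(loose)          # which names the loose pattern matches
print(clean)          # which names end in ".zip"
print(files[!clean])  # names with extra characters after ".zip"
```

Since the goal is only detection, grepl() returns the logical vector directly, and negating it with ! flags the mislabeled names. Note that endsWith() expects a character vector, so a list such as href_names would likely need unlist() first.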