If-statement in RSelenium

I have a long list of chemicals for which I need to extract the CAS number. I have written a for loop that works as intended. However, when a chemical name is not found on the website, my code stops.

Is there a way to account for this in the for loop, so that when a search query is not found, the loop goes back to the start page and searches for the next item in the list?

Below is my code for the for loop, with a short list of names to search for:

library(RSelenium)
library(netstat)

# start the server

rs_driver_object <- rsDriver(browser = "firefox",
                             verbose = FALSE,
                             port = 4847L) # change number if port is not open

# create a client object
remDrCh <- rs_driver_object$client

items <- c("MCPA", "DEET", "apple")
numbers <- list()
for (i in items) {
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  search_box <- remDrCh$findElement(using = 'class', 'search-input')
  search_box$sendKeysToElement(list(paste(i), key = 'enter'))
  Sys.sleep(2)
  result <- remDrCh$findElement(using = "class", "result-content")
  result$clickElement()
  Sys.sleep(2)
  cas <- remDrCh$findElements(using = 'class', 'cas-registry-number')
  cas_n <- lapply(cas, function (x) x$getElementText()) 
  numbers[[i]] <- unlist(cas_n)
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  Sys.sleep(2)
}

The problem lies in the result <- remDrCh$findElement(using = "class", "result-content") part. For "apple" there is no result, and thus no element that R could use, so the loop stops at that call.

I tried to write a separate if/else statement for that specific part, but to no avail: it still only works for queries that yield a result. I also tried findElements, but that only helps in the case where no result is found.

result <- remDrCh$findElement(using = "class", "result-content")
if (length(result) > 0) {
  result$clickElement()
} else {
  remDrCh$navigate("https://commonchemistry.cas.org/")
}

I also tried the approach from "How to check if an object is visible in a webpage by using its xpath?", but I cannot get it to work for my example.

Any help would be much appreciated!

Solution:

This should work:

items <- c("MCPA", "apple", "DEET")
numbers <- list()
for (i in items) {
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  search_box <- remDrCh$findElement(using = 'class', 'search-input')
  search_box$sendKeysToElement(list(paste(i), key = 'enter'))
  Sys.sleep(2)
  result <- try(remDrCh$findElement(using = "class", "result-content"))
  if (!inherits(result, "try-error")) {
    result$clickElement()
    Sys.sleep(2)
    cas <- remDrCh$findElements(using = 'class', 'cas-registry-number')
    cas_n <- lapply(cas, function (x) x$getElementText())
    numbers[[i]] <- unlist(cas_n)
  } else {
    numbers[[i]] <- NA
  }
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  Sys.sleep(2)
}

Note the try() wrapper around the problematic code:

  result <- try(remDrCh$findElement(using = "class", "result-content"))

This captures the error, if there is one, and lets the loop continue. The if statement then checks whether the output of try() is of class "try-error": if it is not, the result is clicked and the CAS number is extracted; otherwise, NA is stored for that item.
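To see the pattern in isolation, here is a minimal sketch that runs without a browser. The function find_result() is a hypothetical stand-in for remDrCh$findElement(), and the values it returns are placeholders rather than real CAS numbers; the only assumption is that the real call raises an error when no element matches, as it does for "apple" in the question.

```r
# Hypothetical stand-in for remDrCh$findElement(): it raises an error when
# nothing matches. The returned strings are placeholders, not real CAS numbers.
find_result <- function(name) {
  known <- c(MCPA = "placeholder-MCPA", DEET = "placeholder-DEET")
  if (!name %in% names(known)) {
    stop("no result found for ", name)
  }
  known[[name]]
}

items <- c("MCPA", "apple", "DEET")
numbers <- list()
for (i in items) {
  # try() returns an object of class "try-error" instead of stopping the loop;
  # silent = TRUE only suppresses the printed error message.
  result <- try(find_result(i), silent = TRUE)
  if (!inherits(result, "try-error")) {
    numbers[[i]] <- result
  } else {
    numbers[[i]] <- NA
  }
}

str(numbers)
```

The failed lookup is stored as NA rather than dropped, so numbers keeps one entry per item and the missing chemicals are easy to spot afterwards.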
