Webscraping with R For Loop of Multiple Pages

I am trying to scrape the locations of real estate listings in Vienna. It works for a single page, but not for multiple pages:

library(rvest)
library(dplyr)

link <- "https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-4"
page <- read_html(link)

location <- page %>% html_elements(".YqNih") %>% html_text()

flat <- data.frame(location, stringsAsFactors = FALSE)

However, the for loop does not return the locations from every page as it should:

library(rvest)
library(dplyr)

flat_II = data.frame()

for (i in 2:20) {
  link <- paste0("https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-", i)
  page <- read_html(link)
  
  location <- page %>% html_element(".YqNih") %>% html_text()
  
  flat_II = rbind(flat_II, data.frame(location, stringsAsFactors = FALSE))
  print(paste("Page:", i))
}

Solution:

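It seems the class name changes from YqNih to gTYeB at page 6, so the .YqNih selector stops matching from there on. I didn't check any further. Note also that your loop calls html_element(), which returns only the first match on each page; the plural html_elements() used in your single-page code returns all of them. If you want all of the addresses on the page, you may want to select by tag rather than by class and try this line: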

  location <- page %>% html_elements("address") %>% html_text()
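
For completeness, here is a minimal sketch of how the tag-based selector could be wired into the whole loop. The helper names extract_locations and scrape_locations, the base_url parameter, and the one-second pause are my own assumptions rather than anything from the question, and the demo markup at the end is made up purely to show that the selector no longer depends on the class name. I have not verified it against the live site.

```r
library(rvest)

# Extract the text of every <address> element on a parsed page.
# Selecting by tag avoids relying on class names such as .YqNih or .gTYeB,
# which appear to differ between result pages.
extract_locations <- function(page) {
  page %>% html_elements("address") %>% html_text(trim = TRUE)
}

# Hypothetical helper: scrape several result pages into one data frame.
# `base_url` is the listing URL up to and including "seite-".
scrape_locations <- function(pages, base_url) {
  results <- lapply(pages, function(i) {
    page <- read_html(paste0(base_url, i))
    locations <- extract_locations(page)
    Sys.sleep(1)  # assumed pause between requests, to go easy on the server
    data.frame(page = rep(i, length(locations)),
               location = locations,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, results)
}

# Offline check with made-up markup: two listings whose class names differ.
demo <- minimal_html('
  <address class="YqNih">1010 Wien</address>
  <address class="gTYeB">1020 Wien</address>
')
demo_locations <- extract_locations(demo)
```

With the URL from the question, the call would look like scrape_locations(2:20, "https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-"). Collecting the per-page data frames in a list and binding them once at the end also avoids growing flat_II with rbind() on every iteration.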