Webscraping with R For Loop of Multiple Pages

I am trying to scrape the locations of real estate listings in Vienna. It works for a single page, but not for multiple pages:

library(rvest)
library(dplyr)

link <- "https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-4"
page <- read_html(link)

location <- page %>% html_elements(".YqNih") %>% html_text()

flat <- data.frame(location, stringsAsFactors = FALSE)

However, the for loop does not return the results for all pages as it should:

library(rvest)
library(dplyr)

flat_II = data.frame()

for (i in 2:20) {
  link <- paste0("https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-", i)
  page <- read_html(link)
  
  location <- page %>% html_element(".YqNih") %>% html_text()
  
  flat_II = rbind(flat_II, data.frame(location, stringsAsFactors = FALSE))
  print(paste("Page:", i))
}

Solution:

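It seems the class name changes from YqNih to gTYeB at page 6 (the leading dot in ".YqNih" makes it a class selector, not an ID). I didn’t check any further. If you want all of the addresses on each page, you may want to select the address elements instead, replacing the location line inside the loop with this one: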

  location <- page %>% html_nodes("address") %>% html_text()
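For reference, here is a minimal sketch of how the whole loop might look with that selector. It assumes every listing's location sits in an address element on every page, which I only checked on a few of them. The helper names extract_locations and scrape_locations, the page column, and the one-second pause are my own additions, not anything from the site or from rvest. It uses html_elements(), which is the current name for html_nodes() in rvest 1.0 and later.

```r
library(rvest)

base_url <- "https://www.immobilienscout24.at/regional/wien/wien/immobilie-kaufen/seite-"

# Hypothetical helper: return the text of every <address> element
# on one parsed page as a character vector.
extract_locations <- function(page) {
  page %>% html_elements("address") %>% html_text(trim = TRUE)
}

# Hypothetical helper: read each result page and stack the locations
# into one data frame, remembering which page each row came from.
scrape_locations <- function(pages, pause = 1) {
  results <- lapply(pages, function(i) {
    page <- read_html(paste0(base_url, i))
    Sys.sleep(pause)  # assumed courtesy delay between requests
    locs <- extract_locations(page)
    data.frame(page = rep(i, length(locs)),
               location = locs,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, results)
}

# Usage (needs network access):
# flat_II <- scrape_locations(2:20)
```

Selecting by tag name rather than by a class like YqNih should be less likely to break when the class names change between pages, although it will also pick up any address element that is not a listing.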
