How can I scrape a table from the website in the question

I am trying to copy a table from a webpage. There will eventually be many of them, because I want the versions of the data for each dataset, but so far I cannot get even one table. Scraping is not my thing, so the answer may be obvious, just not to me.

Here is my code:

url <- "https://data.cms.gov/provider-characteristics/medicare-provider-supplier-enrollment/medicare-fee-for-service-public-provider-enrollment/api-docs"

html <- rvest::read_html(url)
> html |> rvest::html_node(".table")
{xml_missing}
<NA>

And

> html |>
 rvest::html_node(xpath = "/html/body/div/div/div/div/div/div/div[2]/div[2]/div/div/table/tbody")
{xml_missing}
<NA>

And

html |>
  rvest::html_node("tbody")

Solution:

Unfortunately, this approach is not going to work. The tables on the page you’re looking at are generated by JavaScript. The rvest::read_html(url) call retrieves the static content of that page but does not execute any JavaScript, so the table elements never exist in the document that rvest parses and every selector returns {xml_missing}.
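The effect can be reproduced offline with a small stand-in page. This is only a sketch: the HTML string below is made up for illustration and is not the CMS page. Its table would be created by a script in a browser, but read_html() only parses the markup it is given.

```r
library(rvest)

# Made-up stand-in page: in a browser, the script would add a <table>
# to the empty <div>. read_html() parses the markup but never runs the script.
page <- '<html><body>
  <div id="app"></div>
  <script>
    document.getElementById("app").appendChild(document.createElement("table"));
  </script>
</body></html>'

doc <- read_html(page)

# No <table> element exists in the parsed document.
n_tables <- length(html_elements(doc, "table"))
print(n_tables)  # 0
```

A quick check like length(html_elements(html, "table")) on the real page is a cheap way to tell whether a table is in the static HTML at all before trying more selectors.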

But there is an API behind the site, so you can get the data directly from that. For example:

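library(httr)

# Query parameter: the dataset's path on data.cms.gov (the page URL without "/api-docs")
params <- list(
  path = "/provider-characteristics/medicare-provider-supplier-enrollment/medicare-fee-for-service-public-provider-enrollment"
)

# Ask the slug endpoint about that dataset
res <- httr::GET(url = "https://data.cms.gov/data-api/v1/slug", query = params)

# Print the response body as text
cat(httr::content(res, as = "text", encoding = "UTF-8"))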
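Rather than printing the text, the body can be parsed into an R object. The following is a minimal sketch, assuming the endpoint replies with JSON and that the {jsonlite} package is installed. get_slug() is a hypothetical helper name, and the structure of the result is not described here, so inspect it before relying on any field.

```r
# Hypothetical helper: fetch the slug metadata and parse it into an R object.
# Assumes the endpoint replies with JSON.
get_slug <- function(path) {
  res <- httr::GET(
    url = "https://data.cms.gov/data-api/v1/slug",
    query = list(path = path)
  )
  httr::stop_for_status(res)  # turn a 4xx/5xx reply into an R error
  txt <- httr::content(res, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(txt)
}

# Usage (needs network access):
# meta <- get_slug("/provider-characteristics/medicare-provider-supplier-enrollment/medicare-fee-for-service-public-provider-enrollment")
# str(meta, max.level = 1)  # look at the top-level fields first
```

Because the helper takes the path as an argument, the same call can be repeated for each dataset page of interest.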

Alternatively, you can use something like {RSelenium} to drive a real browser, which executes the JavaScript, and then scrape the fully rendered page.
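A rough sketch of the browser route, assuming {RSelenium} is installed and that Firefox with a working driver is available on the machine. scrape_rendered_tables() and its wait argument are hypothetical names, and the fixed pause is a crude stand-in for properly waiting until the table appears.

```r
# Hypothetical helper: render the page in a real browser, then hand the
# resulting HTML to rvest. Assumes Firefox and its driver can be started.
scrape_rendered_tables <- function(url, wait = 5) {
  driver <- RSelenium::rsDriver(
    browser = "firefox",
    chromever = NULL,  # do not fetch a Chrome driver for a Firefox session
    verbose = FALSE
  )
  # Always close the browser session and stop the Selenium server.
  on.exit({
    driver$client$close()
    driver$server$stop()
  }, add = TRUE)

  driver$client$navigate(url)
  Sys.sleep(wait)  # crude fixed pause while the JavaScript builds the page

  # getPageSource() returns a list; its first element is the rendered HTML.
  src <- driver$client$getPageSource()[[1]]
  rvest::read_html(src) |> rvest::html_table()
}

# Usage (starts a browser):
# tables <- scrape_rendered_tables(url)
```

This is much heavier than calling the API, so it is mainly worth it when no API exposes the data you need.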
