How to save result as "ND" when there is no record? rvest and R

January 23, 2022

I have these two example HTML files: url1.html and url2.html.

In url1.html the title is present, and in url2.html there is no information (71).

I’m using this code in R:

library(rvest)
library(tidyverse)

x<-data.frame(
    URL=c(1:2),
    page=c(paste(readLines("url1.html"), collapse="\n"),
                 paste(readLines("url2.html"), collapse="\n"))
) 

for (i in 1:nrow(x)){
    html<-x$page[i]%>% unclass() %>% unlist()
    read_html(html,encoding = "ISO-8859-1") %>% 
        rvest::html_elements(xpath = '//*[@id="principal"]/table[2]') %>% 
        rvest::html_elements(xpath = '//div[@id="tituloContext"]') %>% 
        html_text()%>%  
        str_replace_all(.,"[\\n\\r\\t]+", "")%>%
        stringr::str_trim( ) -> x$title[i]
}

Result: title

[1] "Â  CARRINHO DE LIXO PARA LIMPEZA URBANA"
character(0)

Problem: although I’m retrieving the correct content from URL1, I would like to save the value "ND" when the content doesn’t exist (e.g. URL2).

Expected output: not available (ND).

[1] "Â  CARRINHO DE LIXO PARA LIMPEZA URBANA"
[1] "ND"

Any idea how to solve this problem?

Is it possible to optimize this code as well?

>Solution :

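We could check the length of the result and, if it is 0 (length(character(0)) is 0), change the value to "ND". html_text() returns character(0) when the XPath matches no nodes, which is what happens for the second page.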

for (i in seq_len(nrow(x))) {
    html <- x$page[i] %>%
        unclass() %>%
        unlist()
    read_html(html, encoding = "ISO-8859-1") %>%
        rvest::html_elements(xpath = '//*[@id="principal"]/table[2]') %>%
        rvest::html_elements(xpath = '//div[@id="tituloContext"]') %>%
        html_text() %>%
        str_replace_all(., "[\\n\\r\\t]+", "") %>%
        stringr::str_trim() -> tmp
    # No node matched: tmp is character(0), so store "ND" instead
    if (length(tmp) == 0) tmp <- "ND"
    x$title[i] <- tmp
}

Checking the result:

> x$title
[1] "CARRINHO DE LIXO PARA LIMPEZA URBANA" "ND"
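On the question of optimizing the code, one possible direction is to move the extraction into a small function and apply it to every page with vapply() instead of growing x$title inside a loop. The sketch below is only an illustration under stated assumptions: extract_title is a hypothetical helper name, the two inline HTML strings stand in for url1.html and url2.html, and it assumes the div id alone is enough to locate the title, so the table XPath is dropped.

```r
library(rvest)

# Hypothetical helper: returns the cleaned title text of a page,
# or `default` when no matching node exists.
extract_title <- function(html, default = "ND") {
    doc <- read_html(html)
    # Assumption: the id alone identifies the title node
    nodes <- html_elements(doc, xpath = '//div[@id="tituloContext"]')
    txt <- html_text(nodes)
    # Drop newlines, carriage returns and tabs, then trim outer whitespace
    txt <- trimws(gsub("[\n\r\t]+", "", txt))
    # An empty result means the node was absent on this page
    if (length(txt) == 0) default else txt[1]
}

# Stand-ins for the contents of url1.html and url2.html
pages <- c(
    '<html><body><div id="tituloContext">\n\t CARRINHO DE LIXO \n</div></body></html>',
    '<html><body><p>no title here</p></body></html>'
)

# vapply() guarantees exactly one string per page
titles <- vapply(pages, extract_title, character(1), USE.NAMES = FALSE)
titles
```

With the data frame from the question, the same helper could be used as x$title <- vapply(x$page, extract_title, character(1), USE.NAMES = FALSE).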

rvest

by MR
