How do I count icons using rvest?

I want to count the number of overall stars for each player on this page: https://cbgm.news/stats/CONN_Ratings.html

Here’s my rvest code:

library(tidyverse)
library(rvest)

url <- "https://cbgm.news/stats/CONN_Ratings.html"

scrape <- url %>% 
  read_html() %>% 
  html_nodes("td:nth-child(19)")

scrape

This returns:

{xml_nodeset (14)}
 [1] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [2] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [3] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [4] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [5] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [6] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [7] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [8] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [9] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[10] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[11] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[12] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[13] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[14] <td><i class="star half yellow icon"></i></td>\n

How do I convert the xml_nodeset to a df/tibble that allows for mutating and counting the number of star icons?

I appreciate any help with this puzzle!

Solution:

You could write a small function that counts the star icons in a cell, scoring each full star as 1 and each half star as 0.5. Then use mutate() to fill the OVERALL column with the result of applying that function to each element of scrape.

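# Count full stars as 1 and half stars as 0.5 in one table cell.
f <- function(s) {
  html <- as.character(s)  # serialize the node to its HTML string
  # "star yellow" never matches inside "star half yellow", so half stars are not double-counted
  str_count(html, "star yellow") + str_count(html, "star half") / 2
}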
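
As a quick sanity check, the counting rule can be tried on a single cell before touching the live page. This is only a sketch: the cell string below is hypothetical markup modelled on the nodeset printed in the question, and f is repeated so the snippet runs on its own.

```r
library(stringr)

# Same counting rule as f above, repeated so this snippet is self-contained
f <- function(s) {
  html <- as.character(s)
  str_count(html, "star yellow") + str_count(html, "star half") / 2
}

# Hypothetical cell: two full stars followed by one half star
cell <- '<td><i class="star yellow icon"></i><i class="star yellow icon"></i><i class="star half yellow icon"></i></td>'

result <- f(cell)
result  # 2.5
```

The half star contributes only 0.5 because its class string reads "star half yellow icon", which contains "star half" but not the contiguous text "star yellow".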

Now read the table with rvest::html_table() and fill its OVERALL column using mutate():

rvest::html_table(url %>% read_html)[[1]] %>% 
  mutate(OVERALL = sapply(scrape,f))

Output:

     NUM POS   PLAYER    FGI   FGJ    FT   SCR   PAS   HDL   ORB   DRB   DEF   BLK   STL  DRFL    DI    IQ   ATH OVERALL
   <int> <chr> <chr>   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>   <dbl>
 1     1 PG    Marek …    36    76    54    73    81    50    66    72    65    72    75    21    53    42    58     4  
 2    14 PG    Brian …    10    90    72    50    74    32    71    71    69    53    82    32    65    57    91     3.5
 3    15 PG    Morris…    25    85    56    71    53    60    10    53    76    10    53    28    72    47    76     2  
 4    12 SG    Ryan M…    31    78    96    74    46    38    50    43    71    46    40    35    61    45    75     3.5
 5    21 SG    Lenny …    10    90    67    50    60    49    56    71    58    60    66    39    56    38    69     3  
 6     5 SG    Fred M…    10    83    61    71    30    23    10    78    63    10    16    39    61    38    87     2  
 7    23 SF    Will B…    35    73    58    74    66    38    70    72    52    60    74    21    42    46    30     4  
 8    51 SF    Lyly L…    51    76    83    84    75    32    66    81    61    70    85    24    52    47    60     5  
 9    42 SF    Joe Ch…    58    50    80    70    56    39    53    53    78    10    53    21    54    52    78     2  
10    40 PF    Richar…    63    50    41    72    71    32    79    78    65    71    71    54    39    43    72     4  
11    30 PF    Ammer …    56    54    81    63    60    23    72    72    78    66    56    35    50    54    58     3.5
12    54 C     Xavier…   100    33    36   100    76    16    96    91    76   100    87    61    28    41    73     5  
13    45 C     Brad L…    91    38    56    60    63    19    75    76    78    82    70    58    30    28    68     4  
14    10 C     Ed Str…    68    40    45    10    10    10    13    10    10    10    10    24    17    16    10     0.5
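
An alternative to matching on the serialized HTML is to count the <i> elements themselves with CSS class selectors. The sketch below is not part of the original answer. It assumes, based on the nodeset shown in the question, that every star carries the classes star and yellow and that half stars additionally carry half. The inline page built with minimal_html() is a hypothetical stand-in so the example runs without network access, and count_stars is an illustrative name.

```r
library(rvest)
library(tibble)

# Hypothetical stand-in for the ratings table: one star cell per row
page <- minimal_html('
  <table>
    <tr><td><i class="star yellow icon"></i><i class="star yellow icon"></i><i class="star half yellow icon"></i></td></tr>
    <tr><td><i class="star half yellow icon"></i></td></tr>
  </table>')

cells <- html_elements(page, "td")

# Count stars in one cell: half stars also match i.star.yellow,
# so subtract them from the total before adding them back at 0.5 each
count_stars <- function(cell) {
  half  <- length(html_elements(cell, "i.star.half"))
  total <- length(html_elements(cell, "i.star.yellow"))
  (total - half) + half / 2
}

stars <- vapply(cells, count_stars, numeric(1))

# One row per cell, ready for mutate() or joins
ratings <- tibble(cell = seq_along(cells), stars = stars)
ratings
```

Against the real page, cells would be the scrape nodeset from the question, and stars could be passed to mutate(OVERALL = stars) in place of sapply(scrape, f). html_elements() is the current name for html_nodes() in rvest 1.0.0 and later. Selecting by class does not depend on the exact order of the class names in the markup, which the string-matching approach does.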