Make a general working code with tables from Wiki

I am on a quest to write a general function that reads the Premier League table from Wikipedia. The desired function is sketched below: read_prem_league(2020) should return the table for the requested season as a tibble. The season pages follow this URL pattern:

"https://en.wikipedia.org/wiki/20XX–XX_Premier_League"

read_prem_league <- function(season){ }

read_prem_league(2020)

I have the code below, which works for a single season. What I'm trying to figure out is how to turn the XX in the URL into a parameter, so that e.g. read_prem_league(2019) returns the table for the season ending in 2019.

library(rvest)

url <- "https://en.wikipedia.org/wiki/2018–19_Premier_League"
page <- read_html(url)
# Collect all tables on the page, then parse the fifth one
prem <- html_elements(page, css = "table")
premierleague <- html_table(prem[[5]])

premierleague

Solution:

You could use paste0 and substr to build the URL for a given year, where year is the year in which the season ends:

library(rvest)

read_prem_league <- function(year) {
  "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4), "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5)
}

read_prem_league(2021)
#> # A tibble: 20 x 11
#>      Pos Team                   Pld     W     D     L    GF    GA GD      Pts
#>    <int> <chr>                <int> <int> <int> <int> <int> <int> <chr> <int>
#>  1     1 Manchester City (C)     38    27     5     6    83    32 +51      86
#>  2     2 Manchester United       38    21    11     6    73    44 +29      74
#>  3     3 Liverpool               38    20     9     9    68    42 +26      69
#>  4     4 Chelsea                 38    19    10     9    58    36 +22      67
#>  5     5 Leicester City          38    20     6    12    68    50 +18      66
#>  6     6 West Ham United         38    19     8    11    62    47 +15      65
#>  7     7 Tottenham Hotspur       38    18     8    12    68    45 +23      62
#>  8     8 Arsenal                 38    18     7    13    55    39 +16      61
#>  9     9 Leeds United            38    18     5    15    62    54 +8       59
#> 10    10 Everton                 38    17     8    13    47    48 -1       59
#> 11    11 Aston Villa             38    16     7    15    55    46 +9       55
#> 12    12 Newcastle United        38    12     9    17    46    62 -16      45
#> 13    13 Wolverhampton Wande~    38    12     9    17    36    52 -16      45
#> 14    14 Crystal Palace          38    12     8    18    41    66 -25      44
#> 15    15 Southampton             38    12     7    19    47    68 -21      43
#> 16    16 Brighton & Hove Alb~    38     9    14    15    40    46 -6       41
#> 17    17 Burnley                 38    10     9    19    33    55 -22      39
#> 18    18 Fulham (R)              38     5    13    20    27    53 -26      28
#> 19    19 West Bromwich Albio~    38     5    11    22    35    76 -41      26
#> 20    20 Sheffield United (R)    38     7     2    29    20    63 -43      23
#> # ... with 1 more variable: `Qualification or relegation` <chr>

Created on 2022-07-30 by the reprex package (v2.0.1)
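
If the function needs to hold up across many seasons, it may help to separate building the URL from picking the table. The following is only a sketch under some assumptions: prem_league_url, pick_league_table and read_prem_league_by_name are hypothetical names that do not appear in the answer above, and the column names Pos, Team and Pts are taken from the output shown above and are assumed to be present on the other season pages as well. Selecting the table by its columns rather than by position 5 guards against pages where the standings are not the fifth table, which I have not verified either way.

```r
# Hypothetical helper: build the Wikipedia URL for the season ending in `year`.
# It produces the same string as the paste0()/substr() approach above.
prem_league_url <- function(year) {
  stopifnot(is.numeric(year), length(year) == 1, year == round(year))
  sprintf(
    "https://en.wikipedia.org/wiki/%d-%02d_Premier_League",
    as.integer(year) - 1L,    # starting year of the season, e.g. 2018
    as.integer(year) %% 100L  # two-digit ending year, zero-padded, e.g. 19
  )
}

# Hypothetical helper: return the first table that contains all `required`
# columns, instead of relying on a fixed position in the list of tables.
pick_league_table <- function(tables, required = c("Pos", "Team", "Pts")) {
  is_standings <- vapply(
    tables,
    function(tbl) all(required %in% names(tbl)),
    logical(1)
  )
  if (!any(is_standings)) {
    stop("No table with columns: ", paste(required, collapse = ", "))
  }
  tables[[which(is_standings)[1]]]
}

# Hypothetical wrapper combining both helpers; requires the rvest package
# and network access when it is called.
read_prem_league_by_name <- function(year) {
  page <- rvest::read_html(prem_league_url(year))
  pick_league_table(rvest::html_table(page))
}
```

Calling read_prem_league_by_name(2021) is meant to be used the same way as read_prem_league(2021) above; the two helpers can also be checked on their own without downloading anything.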
