There is a website with a ‘Download Data’ button. When I click this button, a .csv file is downloaded. How can I write a Python program to do the same thing? Here is the website: https://climate.weather.gc.ca/climate_data/hourly_data_e.html?hlyRange=2013-06-11%7C2023-05-13&dlyRange=2013-06-13%7C2023-05-12&mlyRange=%7C&StationID=51459&Prov=ON&urlExtension=_e.html&searchType=stnName&optLimit=yearRange&StartYear=2022&EndYear=2023&selRowPerPage=25&Line=3&searchMethod=contains&txtStationName=Toronto&timeframe=1&time=LST&time=LST&Year=2020&Month=5&Day=22#
After doing some research, I now understand that HTML has a tree structure and that I could use requests and BeautifulSoup to go through each layer to find what I need. But I have gone through the entire tree structure and I didn’t find where the csv file is stored. Am I heading in the right direction? Where should I look to find where the files are stored?
So I ended up scraping the data shown on the web page (because I can find it in the tree structure). But what I want is the csv file that the page offers for download.
I know there are some similar posts, but the ones I checked didn’t work. Can someone give me some advice on whether what I am trying to do is correct?
Thanks.
>Solution :
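If you inspect what happens when you click that button (for example, in the Network tab of your browser's developer tools), you will see that it simply requests a URL with the station and date in the query string. So you can drop BS4 and get the csv data via urllib3 or requests by fetching: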
https://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=csv&stationID=51459&Year=2020&Month=5&Day=22&time=LST&timeframe=1&submit=Download+Data
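As a rough sketch of how that could look in Python: the snippet below rebuilds the URL above from its query parameters and saves the response to a file. It uses the standard library's urllib rather than urllib3 or requests, only so that nothing needs to be installed; `requests.get(BASE_URL, params=...)` should work the same way. The function names, the default values, and the output file name are my own choices for illustration, and I am assuming the server needs no extra headers or cookies.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the download link above.
BASE_URL = "https://climate.weather.gc.ca/climate_data/bulk_data_e.html"


def build_csv_url(station_id, year, month, day=1, timeframe=1):
    """Build a download URL with the same query parameters as the link above.

    The parameter names are copied from that link; what each value means
    on the server side is an assumption based on how the URL reads.
    """
    params = {
        "format": "csv",
        "stationID": station_id,
        "Year": year,
        "Month": month,
        "Day": day,
        "time": "LST",
        "timeframe": timeframe,
        "submit": "Download Data",  # urlencode turns the space into '+'
    }
    return f"{BASE_URL}?{urlencode(params)}"


def download_csv(url, out_path):
    """Fetch the URL and write the response body to out_path unchanged."""
    with urlopen(url, timeout=30) as response:
        data = response.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return out_path
```

For example, `download_csv(build_csv_url(51459, 2020, 5, 22), "toronto.csv")` would request the same URL as the link above and save whatever the server returns to `toronto.csv`. Changing the arguments to `build_csv_url` lets you loop over other stations or dates without touching the HTML page at all.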