Parse a table from Wikipedia that is hidden

I’m pretty new here.
I want to parse a table from Wikipedia at the following link:
https://en.wikipedia.org/wiki/MIUI

I was able to parse the first table, but I can’t figure out how to get the information from the second table there, the one that contains the "version history" of the MIUI browser.
Is there a way to parse it into a DataFrame, with the browser version in one column and the release date in a second column?

Thank you!

So far I have tried this, but it doesn’t return anything; I can’t get the table’s information:

import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/MIUI'
response = requests.get(url)

soup = BeautifulSoup(response.content, 'html.parser')

# Returns None: this exact class string does not appear in the downloaded HTML
table = soup.find('table', class_='wikitable sortable mw-collapsible mw-no-collapsible jquery-tablesorter mw-made-collapsible')

>Solution :

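The table is not hidden. The class string copied from the browser includes class names that are added dynamically by JavaScript (for example jquery-tablesorter and mw-made-collapsible), and these are missing from the HTML that requests downloads. You have to select more specifically, based on the data that is actually available in the response.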
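
To illustrate selecting on classes that exist in the raw response, here is a minimal sketch with BeautifulSoup. The HTML string is a made-up stand-in for the downloaded page (its class list, column names, and values are assumptions, not the real MIUI article), so check the actual response before relying on the selector.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML that requests receives: the table carries
# only server-side classes, not the ones a browser adds later via JavaScript.
html = """
<table class="wikitable"><tr><th>Other</th></tr><tr><td>x</td></tr></table>
<table class="wikitable sortable mw-collapsible">
  <tr><th>Version</th><th>Release date</th></tr>
  <tr><td>1.0</td><td>2020-01-01</td></tr>
  <tr><td>2.0</td><td>2021-01-01</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# The full class string copied from the browser's DOM finds nothing here.
browser_classes = ("wikitable sortable mw-collapsible mw-no-collapsible "
                   "jquery-tablesorter mw-made-collapsible")
missing = soup.find("table", class_=browser_classes)

# A CSS selector matches any table that has both classes, whatever else it has.
table = soup.select_one("table.wikitable.mw-collapsible")

# Collect the text of every header and data cell, row by row.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table.find_all("tr")
]
```

Here missing is None while table is the second table, and rows holds the header row followed by the data rows, ready to be passed to a DataFrame.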


You could read the table directly via pandas.read_html(), picking it by index:

import pandas as pd
pd.read_html('https://en.wikipedia.org/wiki/MIUI', displayed_only=False)[2]

or by matching its content with match='Version':

import pandas as pd
pd.read_html('https://en.wikipedia.org/wiki/MIUI', displayed_only=False, match='Version')[0]
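
To get just the two columns asked for, something like the following sketch may help. It runs on an inline HTML sample instead of the live page, and the column names "Version" and "Release date" as well as the values are assumptions for illustration; the real table's headers may differ, so inspect the columns of the returned DataFrame first.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample page: one unrelated table and one hidden version table.
html = """
<table><tr><th>Name</th><th>Value</th></tr><tr><td>a</td><td>1</td></tr></table>
<table style="display:none">
  <tr><th>Version</th><th>Release date</th></tr>
  <tr><td>v1</td><td>2020-01-01</td></tr>
  <tr><td>v2</td><td>2021-06-15</td></tr>
</table>
"""

# match keeps only tables whose text matches the pattern;
# displayed_only=False also parses elements styled with display:none.
tables = pd.read_html(StringIO(html), match="Version", displayed_only=False)

# Keep the two columns of interest (assumed names) and parse the dates.
df = tables[0][["Version", "Release date"]].copy()
df["Release date"] = pd.to_datetime(df["Release date"])
```

Note that read_html needs an HTML parser such as lxml (or bs4 with html5lib) to be installed.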