Pandas – How To Clean Up Scrape

My goal is to open a clinical trials page and pull the last row of a given table.

My current code pulls the entire last row, which is more information than I need.

I would like to pull only the date from that row.

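import time

import pandas as pd
from selenium import webdriver

# Open the trial page in a Selenium-controlled Chrome window
driver = webdriver.Chrome()

url = 'https://clinicaltrials.gov/ct2/show/NCT03328858?cond=brain+tumor&draw=2&rank=4'
driver.get(url)
time.sleep(1)  # give the page a moment to load

# Note: pd.read_html(url) downloads the page itself; it does not use the Selenium session
df = pd.read_html(url)[3]  # fourth table on the page
df3 = df.iloc[-1]          # last row, returned as a Series containing every column
print(df3)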


Solution:

df.iloc[-1] returns the whole last row. To get only the last value of the last column, pass both a row and a column position to .iloc:

df.iloc[-1,-1]

Or select the column by name, if you know it or are sure it will be 'Unnamed: 1':

df['Unnamed: 1'].iloc[-1]

Either one will give you:

January 31, 2020
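
To try both forms without opening the site, here is a minimal sketch on a small hand-built DataFrame. The frame is a hypothetical stand-in for the scraped table: the column names mirror the 'Unnamed: 1' name used above, and every cell except the final date is a made-up placeholder.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table; only the final date
# is taken from the answer above, the other cells are placeholders.
df = pd.DataFrame(
    {
        "Unnamed: 0": ["label A", "label B", "label C"],
        "Unnamed: 1": ["value A", "value B", "January 31, 2020"],
    }
)

last_row = df.iloc[-1]               # whole last row (a Series with both columns)
last_value = df.iloc[-1, -1]         # last row, last column: a single cell
by_name = df["Unnamed: 1"].iloc[-1]  # same cell, selected by column name

print(last_value)
```

The positional form keeps working if the column is renamed, while the name-based form keeps working if extra columns are appended to the right, so which one is safer depends on how the page is likely to change.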