Can't use DataFrame functions on a data frame converted from an HTML file

I have an HTML file that contains a table, and I load that table into a pandas DataFrame like this.

from bs4 import BeautifulSoup
import pandas as pd
with open('/home/lenovo/Downloads/F4311.html', 'r') as f:
    table = BeautifulSoup(f.read()).find('table')

# read_html expects a string, not a <class 'bs4.element.Tag'>, so convert the tag first.
df = pd.read_html(str(table))

It worked, and I could print df too. Then I tried to list its column names.

cols_df=df.columns.tolist()

It threw an error:

AttributeError: 'list' object has no attribute 'columns'

Then I tried to export it to a CSV file.

df.to_csv("data.csv")

It threw another error:

AttributeError: 'list' object has no attribute 'to_csv'

Please help me fix these errors.

>Solution :

If you have a look at the documentation for pd.read_html, you will find that it returns not a dataframe, but "[a] list of DataFrames". This explains the error:

AttributeError: 'list' object has no attribute 'columns'

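In other words, the name df refers to a list, and your actual pd.DataFrame is the first item in that list, which you access with df[0]. The to_csv error has the same cause, so call df[0].columns.tolist() and df[0].to_csv("data.csv") instead.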
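A minimal sketch of the fix follows. It assumes a small inline table as a stand-in for the contents of F4311.html (the column names and values are made up for illustration), and it wraps the HTML in StringIO, which recent pandas versions prefer over a raw string.

```python
from io import StringIO

import pandas as pd

# Hypothetical inline table standing in for the contents of F4311.html.
html = """
<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>alice</td><td>90</td></tr>
  <tr><td>bob</td><td>85</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
tables = pd.read_html(StringIO(html))

# Pick the first (here, the only) table to get an actual DataFrame.
df = tables[0]

# DataFrame methods now work as expected.
cols_df = df.columns.tolist()

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
```

If the page holds several tables, checking len(tables) and printing each item's head() is a quick way to find the right index.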
