Cannot identify what this "2" is in pandas dataframe header?

I am writing a script to format data from an Excel sheet template that I use frequently, so that I don't have to format it manually each time. I am using the following code to remove some useless header rows and make the third row the actual header.

new_header = df.iloc[2] #grab the third row for the header
df = df[3:] #take the data below the new header row
df.columns = new_header #set the header row as the df header
df.reset_index(drop=True, inplace=True)

This works great, except that when I view the dataframe there is a 2 above my index. This does not appear to be the index name or a column name (I have checked both), and there does not appear to be a MultiIndex present. This seems rather simple, but I am stumped as to what this 2 is and how I can remove it.

Image of the DataFrame

Any help would be appreciated.

>Solution :

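Check what new_header contains when you pull the third row with df.iloc[2]. It is a Series, and its printed output ends with "Name: 2": the row's index label travels along as the Series name. When you assign that Series to df.columns, the name becomes the name of the columns index (df.columns.name), which pandas displays above the row index. You can get rid of it by changing the first line to new_header = df.iloc[2].to_list()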
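To see the difference side by side, here is a minimal sketch. The `raw` frame is a made-up stand-in for the Excel template (two junk rows, then the real header), and `promote_header` is a hypothetical helper that wraps the code from the question. Neither comes from the original post.

```python
import pandas as pd

# Hypothetical stand-in for the Excel template: two junk rows, then the real header.
raw = pd.DataFrame([
    ["report", None],
    ["generated", None],
    ["name", "score"],
    ["alice", 90],
    ["bob", 85],
])

def promote_header(df, row, as_list):
    """Use the given row as the header and keep only the rows below it."""
    header = df.iloc[row]            # a Series whose .name is the row label
    if as_list:
        header = header.to_list()    # a plain list carries no name
    out = df[row + 1:].copy()
    out.columns = header
    out.reset_index(drop=True, inplace=True)
    return out

with_series = promote_header(raw, 2, as_list=False)
with_list = promote_header(raw, 2, as_list=True)

print(with_series.columns.name)  # 2, the stray label shown above the index
print(with_list.columns.name)    # None
```

If the frame has already been built, setting `df.columns.name = None` should clear the label as well, without rebuilding the header.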
