I have an excel document that has three rows ahead of the main header(name of columns).
Excel document
When loading the data in pandas data frame using :
import pandas
df = pandas.read_excel('output/tracker.xlsx')
print(df)
I get this data(which is fine):
Date/Time:13/06/2022 Unnamed: 1 Unnamed: 2 Unnamed: 3
0 NaN NaN NaN NaN
1 NaN 2763 2763 NaN
2 NaN Site ID Company Site ID Region
3 203990318_700670803 203990318 689179 Nord-Ost
I do not need the first three rows so I run :
df = df.iloc[2:]
It removes the rows that have ID of 0 and 1.
But it doesn’t remove the Date/Time:13/06/2022 Unnamed: 1 etc row.
How do I remove that top row?
>Solution :
Rather directly load the data without the useless rows using the skiprows parameter of pandas.read_excel:
df = pandas.read_excel('output/tracker.xlsx', skiprows=3)