Pandas read_excel parameter sheet_name

Advertisements

I am building a pipeline that unfortunately requires a data hand off from another team. We have found that the sheet name for a particular piece of data suffers from formatting issues. The sheet is supposed to be named by the month corresponding to the data in all lowercase. However we have received the file multiple times now with all uppercase and mixed case. I believe that this file is generated manually so the sheet is not always in the same position (most of the time it is the first sheet but occasionally it is second). Is there any way to programmatically use Pandas read_excel function to read a sheet name in a case insensitive way?

>Solution :

You can use ExcelFile:

sheet_name = 'data2'  # what you are looking for

with pd.ExcelFile('data.xlsx') as xlsx:
    sheets = [sheet.lower() for sheet in xlsx.sheet_names]
    # If the sheet is not found, it will a ValueError exception
    idx = sheets.index(sheet_name)
    df = pd.read_excel(xlsx, sheet_name=idx)

Details:

>>> xlsx.sheet_names
['Data1', 'DATA2', 'data3']

>>> sheets
['data1', 'data2', 'data3']

>>> sheets.index(sheet_name)
1

Leave a ReplyCancel reply