Trying to convert multiple XLSB files to CSV. Not sure what is the problem here
import os
import pandas as pd
path = r'C://Users//greencolor//Autoreport//Load_attachments//'
for filename in os.listdir(path):
if filename.startswith("PB orders"):
print(filename) #until here its working
month = pd.read_excel(filename, sheet_name="Raw data ", engine="pyxlsb")
print(month) # I get the error here
month = month[month['Sales Manager'] == 'DEVON, JOHN'] #filtering by manager
month.to_csv (path + filename + ".csv", index = None, header=True)
Error
FileNotFoundError: [Errno 2] No such file or directory: ‘PB orders Dec.xlsb’
Why I get this error? print(filename) is printing all the XLSB files that name starts with PB orders
>Solution :
filename is just the file’s name, not the full path. You need to combine it with path to get the full path to the file. You can do that in a safe manner with os.path.join :
import os
...
for filename in os.listdir(path):
if filename.startswith("PB orders"):
full_path = os.path.join(path, filename)
print(full_path )
month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")
Searching with a pattern
An alternative is to use glob to search for files that match a pattern. You still need to generate the full path:
import glob
...
for filename in glob.glob("PB orders*.xlsb", root_dir=path):
full_path = os.path.join(path, filename)
print(full_path )
month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")
Avoiding temp files
You still need to check the file name to avoid the temporary files generated when someone opens an Excel file (the files that start with ~) :
for filename in glob.glob("PB orders*.xlsb", root_dir=path):
if not os.path.basename(filename).startswith("~"):
full_path = os.path.join(path, filename)
print(full_path )
month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")