Filter file extension in Pandas

I want to select only the files with a specific extension (.xlsx) from the file names in a folder, using Pandas. The rest of the script should then run only on the files with that extension.

This is the code I wrote to filter those files:

path = os.getcwd()
files = os.listdir(path)
df_file = pd.DataFrame(files, columns=['Filename'])
df_file

for files in df_file['Filename']:
    if ".xlsx" in files:
        t1_list = df_file["Filename"].str.split(' ')
        print(t1_list)

Basically, it reads the filenames and stores them in a dataframe (df_file). Then I try to filter by ".xlsx" and store the result in a list of lists (t1_list).

But the printed output still lists every file name, so it's not filtering anything. What am I doing wrong?

Thanks

Solution:

This sounds like a task for glob.glob. In this case you can replace

path = os.getcwd()
files = os.listdir(path)

with

import glob
files = glob.glob("*.xlsx")
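
As for why the original loop does not filter: the `if ".xlsx" in files` test only decides whether the split runs, and the split is then applied to the whole `df_file["Filename"]` column, so every file name is printed each time. If you would rather keep the file names in a DataFrame, a boolean mask can do the filtering instead. The following is a minimal sketch, not a definitive implementation; the helper name `filter_by_extension` and the sample file names are made up for illustration, and the case-insensitive match is an assumption about what you want.

```python
import pandas as pd

def filter_by_extension(filenames, extension=".xlsx"):
    """Return a DataFrame holding only the names that end with `extension`.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    df_file = pd.DataFrame(filenames, columns=["Filename"])
    # Boolean mask: True for rows whose name ends with the extension.
    mask = df_file["Filename"].str.lower().str.endswith(extension.lower())
    return df_file[mask].reset_index(drop=True)

# Made-up listing standing in for os.listdir(os.getcwd())
sample_files = ["report.xlsx", "notes.txt", "data.XLSX", "archive.xlsx.bak", "script.py"]
df_xlsx = filter_by_extension(sample_files)
print(df_xlsx)
```

Using `str.endswith` rather than `".xlsx" in name` avoids matching names such as `archive.xlsx.bak`, where the extension appears in the middle of the name.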