How to import a subset of a zip file into Colab?

I have a very big zip file in my Google Drive which contains several subfolders. I'd like to extract only a few of those subfolders into Colab, not all of them. Is there any way to do this?

For instance, suppose the zip file is named "MyBigFile.zip" and contains "folder1", "folder2", "folder3", "folder4", and "folder5". I only want to import and extract "folder1" and "folder4" into my Google Colab session (and, ideally, only 200 images from each). How can this be done? Any suggestions?

*In case it is relevant: each of folders 1-5 contains around 50,000 .png files.

>Solution :

After some searching I found something. You can use Python's zipfile module in Google Colab too. It can extract individual members by name, so you only unpack the files you select.


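from zipfile import ZipFile
from google.colab import drive

drive.mount('/content/drive/')

# Assumed location: adjust to wherever the archive sits in your Drive.
zip_path = '/content/drive/MyDrive/MyBigFile.zip'

def extract(archive, folder_name, number_of_files):
    # Keep only real files under the folder; names ending in "/" are directory entries.
    files = [name for name in archive.namelist()
             if name.startswith(folder_name) and not name.endswith('/')]
    for name in files[:number_of_files]:
        archive.extract(name, 'extractedFolder')

# The with-block closes the archive even if extraction fails.
with ZipFile(zip_path) as archive:
    extract(archive, "folder1/", 200)
    extract(archive, "folder4/", 100)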
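
Since the question asks specifically for images, the same idea can be narrowed to .png members only. The following is a minimal sketch, not a definitive implementation: the function name `extract_subset`, its parameters, and the default `.png` suffix are assumptions made for illustration. It relies only on the standard zipfile module, so it behaves the same inside or outside Colab. Listing the archive reads only its table of contents, so members that are not selected are never decompressed.

```python
import zipfile
from itertools import islice


def extract_subset(zip_path, folder, limit, dest, suffix=".png"):
    """Extract at most `limit` files ending in `suffix` from `folder` in the archive.

    Hypothetical helper for illustration; returns the extracted member names.
    """
    # Normalise the folder so "folder1" does not also match "folder10/...".
    prefix = folder.rstrip("/") + "/"
    with zipfile.ZipFile(zip_path) as archive:
        matches = (
            info.filename
            for info in archive.infolist()
            if info.filename.startswith(prefix)
            and not info.is_dir()  # skip directory entries
            and info.filename.lower().endswith(suffix)
        )
        # islice stops scanning as soon as `limit` names have been collected.
        selected = list(islice(matches, limit))
        archive.extractall(dest, members=selected)
    return selected
```

With Drive mounted as in the answer, a call such as `extract_subset('/content/drive/MyDrive/MyBigFile.zip', 'folder1', 200, 'extractedFolder')` would unpack up to 200 images from folder1; the Drive path here is an assumed location and should be adjusted to where the archive actually lives.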
