How do you get the col number for an entry/entries in Pandas?

As per the title, I’m trying to get the col number (0-30) for an entry. I realise that this is easy to do the other way round (get the entry for col number x) but don’t think this will suit my needs.

The point of this exercise is to download images and rename them as per the SKU + the position in the dataframe (i.e. ‘PRODUCT-A_01.jpg’, ‘PRODUCT-A_02.jpg’, etc).

Currently, this code runs but (obviously) overwrites the same image x times per row instead of creating x images per row. What I need help with is adding the column number so that each file gets a unique name. I’ve marked where I think this should go in the code with a ‘##’.

import os
import re
import requests
import pandas as pd
os.chdir(r'C:\Users\Bot\Python Directory')
df = pd.read_csv('images.csv')

df = df.fillna('')
finalrow = len(df.index)

for f in range(0,finalrow):
    for i in df.iloc[f]:
        if 'https://' in i:
            image = requests.get(i, allow_redirects=True)
            open(df.SKU.iloc[f] + '_' + ## + '.jpg', 'wb').write(image.content)
        else:
            pass

And this is images.csv

SKU         Image 1                             Image 2                             Image 3
PRODUCT-A   https://i.imgur.com/W8Jsnst.png     https://i.imgur.com/hr3XhXd.png     https://i.imgur.com/KDGp4je.jpeg
PRODUCT-B   https://i.imgur.com/gqloImo.jpeg    https://i.imgur.com/0iaSoA3.jpeg    https://i.imgur.com/WNQwHRd.jpeg
PRODUCT-C   https://i.imgur.com/Idnnqxp.jpeg    https://i.imgur.com/j8KEnNn.jpeg    https://i.imgur.com/nAsRtqX.jpeg

Solution:

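You can use enumerate, which yields each value in the row together with its position (0 is the SKU column, so the images start at 1):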

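for f in range(0,finalrow):
    # i is the column position (0 is the SKU column), url is the cell value
    for i, url in enumerate(df.iloc[f]):
        # test the cell value, not the integer position
        if 'https://' in url:
            image = requests.get(url, allow_redirects=True)

            # zero-pad the position: PRODUCT-A_01.jpg, PRODUCT-A_02.jpg, ...
            out_filename = f'{df.SKU.iloc[f]}_{i:02d}.jpg'
            with open(out_filename, 'wb') as out_file:
                out_file.write(image.content)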
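
To check the naming before downloading anything, the enumerate approach can be tried on a small in-memory frame. This is only a sketch: the frame below is a hypothetical stand-in for images.csv with made-up example.com URLs, and planned_filenames is an illustrative helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for images.csv; the URLs are placeholders.
df = pd.DataFrame({
    "SKU": ["PRODUCT-A", "PRODUCT-B"],
    "Image 1": ["https://example.com/a1.png", "https://example.com/b1.png"],
    "Image 2": ["https://example.com/a2.png", ""],
})

def planned_filenames(frame):
    """Return (url, filename) pairs without downloading anything."""
    pairs = []
    for row in range(len(frame.index)):
        # enumerate yields (position, value); position 0 is the SKU column
        for pos, value in enumerate(frame.iloc[row]):
            if "https://" in value:
                pairs.append((value, f"{frame.SKU.iloc[row]}_{pos:02d}.jpg"))
    return pairs

pairs = planned_filenames(df)
for url, name in pairs:
    print(name, "<-", url)
```

With this sample frame the names come out as PRODUCT-A_01.jpg, PRODUCT-A_02.jpg and PRODUCT-B_01.jpg; the empty cell is skipped because it contains no 'https://'.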

Or you can use .loc with the row index and the column name. Note that this puts the column name (e.g. 'Image 1') in the file name rather than the column number:

for idx in df.index:
    # common sku for the row
    sku = df.loc[idx, 'SKU']

    # exclude SKU column
    for col in df.columns[1:]:
        url = df.loc[idx, col]
        if 'https://' in url:
            image = requests.get(url, allow_redirects=True)

            out_filename = f'{sku}_{col}.jpg'
            open(out_filename, 'wb').write(image.content)
        else:
            pass
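
If the numeric position is still wanted while looping over column names, one option is df.columns.get_loc, which returns the integer position of a column label (assuming the labels are unique). A minimal sketch, again using a hypothetical sample frame with placeholder URLs instead of images.csv:

```python
import pandas as pd

# Hypothetical stand-in for images.csv; the URLs are placeholders.
df = pd.DataFrame({
    "SKU": ["PRODUCT-A", "PRODUCT-B"],
    "Image 1": ["https://example.com/a1.png", "https://example.com/b1.png"],
    "Image 2": ["https://example.com/a2.png", ""],
})

names = []
for idx in df.index:
    sku = df.loc[idx, "SKU"]
    # skip the SKU column, as in the .loc version above
    for col in df.columns[1:]:
        url = df.loc[idx, col]
        if "https://" in url:
            # get_loc maps the label to its 0-based column position
            col_number = df.columns.get_loc(col)
            names.append(f"{sku}_{col_number:02d}.jpg")

print(names)
```

Here 'Image 1' maps to position 1 and 'Image 2' to position 2, so the file names match the ones produced by the enumerate version.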