How do you get the col number for an entry/entries in Pandas?

March 30, 2023

As per the title, I’m trying to get the col number (0-30) for an entry. I realise that this is easy to do the other way round (get the entry for col number x) but don’t think this will suit my needs.

The point of this exercise is to download images and rename them as per the SKU + the position in the dataframe (i.e. ‘PRODUCT-A_01.jpg’, ‘PRODUCT-A_02.jpg’, etc).

Currently, this code works but (obviously) overwrites the same image x times per row instead of creating x images per row. This is what I need help with: adding the column number so that each file has a unique name. I’ve marked where I think this should go in the code with a ‘##’.

import os
import re
import requests
import pandas as pd
os.chdir(r'C:\Users\Bot\Python Directory')
df = pd.read_csv('images.csv')

df = df.fillna('')
finalrow = len(df.index)

for f in range(0,finalrow):
    for i in df.iloc[f]:
        if 'https://' in i:
            image = requests.get(i, allow_redirects=True)
            open(df.SKU.iloc[f] + '_' + ## + '.jpg', 'wb').write(image.content)
        else:
            pass

And this is images.csv

SKU         Image 1                             Image 2                             Image 3
PRODUCT-A   https://i.imgur.com/W8Jsnst.png     https://i.imgur.com/hr3XhXd.png     https://i.imgur.com/KDGp4je.jpeg
PRODUCT-B   https://i.imgur.com/gqloImo.jpeg    https://i.imgur.com/0iaSoA3.jpeg    https://i.imgur.com/WNQwHRd.jpeg
PRODUCT-C   https://i.imgur.com/Idnnqxp.jpeg    https://i.imgur.com/j8KEnNn.jpeg    https://i.imgur.com/nAsRtqX.jpeg

>Solution :

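You can use enumerate to get each value's column position as you iterate over the row. SKU is the first column (position 0), so Image 1 gets position 1, and so on:

for f in range(0, finalrow):
    for i, url in enumerate(df.iloc[f]):
        # test the cell value, not the integer position
        if 'https://' in url:
            image = requests.get(url, allow_redirects=True)

            # {i:02d} zero-pads the position, e.g. PRODUCT-A_01.jpg
            out_filename = f'{df.SKU.iloc[f]}_{i:02d}.jpg'
            with open(out_filename, 'wb') as fh:
                fh.write(image.content)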
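To check the numbering before downloading anything, a minimal sketch along these lines builds only the file names. The helper name planned_filenames is hypothetical, and the small in-memory frame is a stand-in for images.csv using a few of the URLs from the question; the zero padding is an assumption based on the 'PRODUCT-A_01.jpg' pattern asked for.

```python
import pandas as pd

# Small in-memory stand-in for images.csv (same column layout as the question).
df = pd.DataFrame({
    "SKU": ["PRODUCT-A", "PRODUCT-B"],
    "Image 1": ["https://i.imgur.com/W8Jsnst.png", "https://i.imgur.com/gqloImo.jpeg"],
    "Image 2": ["https://i.imgur.com/hr3XhXd.png", ""],  # empty cell, as after fillna('')
})

def planned_filenames(frame):
    """Hypothetical helper: return (url, filename) pairs without downloading."""
    pairs = []
    for row in range(len(frame.index)):
        sku = frame["SKU"].iloc[row]
        # enumerate yields the column position alongside each cell value;
        # position 0 is the SKU column, so Image 1 gets position 1.
        for pos, value in enumerate(frame.iloc[row]):
            if isinstance(value, str) and value.startswith("https://"):
                pairs.append((value, f"{sku}_{pos:02d}.jpg"))
    return pairs

pairs = planned_filenames(df)
for url, name in pairs:
    print(name, "<-", url)
```

Once the names look right, each pair can be passed to requests.get and written to disk as in the loop above.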

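Alternatively, you can loop over the column names and use .loc with the row index and column name. Here the file name contains the column name rather than a number (e.g. 'PRODUCT-A_Image 1.jpg'):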

for idx in df.index:
    # common sku for the row
    sku = df.loc[idx, 'SKU']

    # exclude SKU column
    for col in df.columns[1:]:
        url = df.loc[idx, col]
        if 'https://' in url:
            image = requests.get(url, allow_redirects=True)

            out_filename = f'{sku}_{col}.jpg'
            open(out_filename, 'wb').write(image.content)
        else:
            pass
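If you want the numeric position while still looping over column names, df.columns.get_loc returns the position for a given label (an integer when the label is unique). A small sketch, assuming unique column names and the default integer row index; the one-row frame is a stand-in for images.csv:

```python
import pandas as pd

# One-row stand-in for images.csv.
df = pd.DataFrame({
    "SKU": ["PRODUCT-A"],
    "Image 1": ["https://i.imgur.com/W8Jsnst.png"],
    "Image 2": ["https://i.imgur.com/hr3XhXd.png"],
})

# Position of a single column, looked up by its label.
image_2_pos = df.columns.get_loc("Image 2")

# Position of every column, as {label: position}.
positions = {col: df.columns.get_loc(col) for col in df.columns}

# Column position(s) holding a given entry in the first row.
target = "https://i.imgur.com/hr3XhXd.png"
matches = [positions[col] for col in df.columns if df.loc[0, col] == target]

print(image_2_pos, positions, matches)
```

In the loop above, f'{sku}_{df.columns.get_loc(col):02d}.jpg' would then give numbered file names instead of ones containing the column name.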