
Storing a list from a dictionary value in a single cell in pandas

I am scraping a product page on this website and collecting the results in a dictionary:
https://www.adamhall.com/shop/gi-en/cables-connectors/pre-assembled-cables/microphone-cables/3323/4-star-mmf-1000
The image links go into the product dictionary as a list, and I want that list to end up as a single cell value in the image column of a DataFrame. Instead, the resulting DataFrame has as many rows as there are image links.

Here is my code so far:

from requests_html import HTMLSession
import pandas as pd


url = 'https://www.adamhall.com/shop/gi-en/cables-connectors/pre-assembled-cables/microphone-cables/3323/4-star-mmf-1000'

# product_properties=

def get_product(url):
  s = HTMLSession()
  r = s.get(url)

  # Collect the zoom-image URL of every product image on the page
  images = r.html.find('img.js-zoom-image')
  links = []
  for image in images:
    link = image.attrs['data-zoom']
    links.append(link)

  # 'image' holds a list; every other value is a single string
  product = {
    'id': r.html.find('div.right-item', first=True).text.strip(),
    'title': r.html.find('h1.articlename', first=True).text.strip().replace('\n', ' '),
    'description': r.html.find('div.description >p', first=True).text.strip(),
    'details': r.html.find('div.js-accordion__content.specification__content', first=True).text.strip(),
    'image': links,
    }
  return product

AHdf=pd.DataFrame(get_product(url))

print(AHdf)

Here is what gets returned:

              id  ...                                              image
0  K4MMF1000  ...  https://cdn-shop.adamhall.com/ORIGINAL/media/M...
1  K4MMF1000  ...  https://cdn-shop.adamhall.com/ORIGINAL/media/M...

I would like it to have just one row, with all the image links stored together as a comma-separated list in a single cell of the 'image' column.

Solution:

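Wrap the dictionary returned by your function in a list. When a dict is passed directly, pandas treats each key as a column and each list value as that column's data, so the list under 'image' produces one row per link and the scalar values are repeated to match. A list containing one dict is read as a single record, which gives a single row: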

#                   v----------------v
AHdf = pd.DataFrame([get_product(url)])
print(AHdf)

# Output
          id                             title                                        description                                            details                                              image
0  K4MMF1000  Adam Hall Cables 4 STAR MMF 1000  Professional, balanced microphone cable practi...  Cable Length\n10 m\nColor\nBlack\nCable diamet...  [https://cdn-shop.adamhall.com/ORIGINAL/media/...
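
To see the difference without hitting the website, here is a minimal sketch that uses a hand-written dictionary in place of the real get_product(url) result. The product values and link names below are placeholders assumed for illustration, not data from the page:

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by get_product(url);
# the link values are placeholders, not real URLs from the page.
product = {
    "id": "K4MMF1000",
    "title": "Adam Hall Cables 4 STAR MMF 1000",
    "image": ["link-1.jpg", "link-2.jpg"],
}

# Dict passed directly: keys become columns, the list supplies the column's
# values, and the scalar values are repeated to match the list's length.
expanded = pd.DataFrame(product)

# Dict wrapped in a list: one record, so one row; the list stays in one cell.
single = pd.DataFrame([product])

print(expanded.shape)
print(single.shape)
print(single.loc[0, "image"])
```

With two placeholder links, the first frame should have two rows and the second a single row whose 'image' cell still holds the Python list.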

Another way is to use json_normalize:

AHdf = pd.json_normalize(get_product(url))
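
As a rough illustration of this variant, the sketch below runs json_normalize on a placeholder dictionary (assumed values, standing in for the scraped product). It also shows one possible way to turn the list into a plain comma-separated string, in case a string is preferred over a list object in the cell; that extra step is an optional assumption about what the final output should look like:

```python
import pandas as pd

# Hypothetical stand-in for get_product(url); link values are placeholders.
product = {"id": "K4MMF1000", "image": ["link-1.jpg", "link-2.jpg"]}

# json_normalize treats a single dict as one record, so the result has one
# row and the list under 'image' is kept as a single cell value.
normalized = pd.json_normalize(product)

# Optional: replace the list with a comma-separated string.
joined = normalized.copy()
joined["image"] = joined["image"].apply(lambda links: ", ".join(links))

print(normalized.shape)
print(joined.loc[0, "image"])
```

Keeping the list is handy for further processing in Python, while the joined string tends to be easier to read after exporting to CSV.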