How to extract data under a tag from a website using bs4

December 24, 2021

<html>

<head>
  <title>Index of /pub/opera/desktop/</title>
</head>

<body>
  <h1>Index of /pub/opera/desktop/</h1>
  <hr>
  <pre><a href="../">../</a>
<a href="15.0.1147.130/">15.0.1147.130/</a>                                     01-Jul-2013 15:18                   -
<a href="15.0.1147.132/">15.0.1147.132/</a>                                     01-Jul-2013 15:18                   -
<a href="15.0.1147.138/">15.0.1147.138/</a>                                     09-Jul-2013 12:11

I need to extract the version (for example 15.0.1147.130) and the date (for example 01-Jul-2013 15:18) for each entry.
However, my code only gives me the version:

soup = BeautifulSoup(requests.get('https://get.geo.opera.com/pub/opera/desktop/').text, 'html.parser')
for item in soup.find('pre').find_all('a')[1:]:
    print(item)

What am I missing to get the date text too?

>Solution :

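You are selecting only the <a> tags, and they don't contain the date. The date is plain text inside the <pre> element, after each link, so read the text of <pre> instead: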

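    soup = BeautifulSoup(requests.get('https://get.geo.opera.com/pub/opera/desktop/').text, 'html.parser')
    for item in soup.find_all('pre'):
        # The text of <pre> holds every row: version, date, time and size.
        # Strip only the slashes; removing '-' would also strip the hyphens in the dates.
        print(item.get_text().replace('/', ""))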
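
Another option, sketched here on the assumption that every link in the listing is followed directly by a text node holding the date, time and size: read the next_sibling of each <a> tag. The sample HTML string and the rows list are only for illustration and are not part of the original code.

```python
from bs4 import BeautifulSoup

# Sample copied from the listing in the question (assumed representative of the page).
html = """<pre><a href="../">../</a>
<a href="15.0.1147.130/">15.0.1147.130/</a>                01-Jul-2013 15:18        -
<a href="15.0.1147.132/">15.0.1147.132/</a>                01-Jul-2013 15:18        -
</pre>"""

soup = BeautifulSoup(html, 'html.parser')
rows = []
for link in soup.find('pre').find_all('a')[1:]:  # [1:] skips the "../" entry
    # The text right after the </a> tag holds the date, time and size columns.
    tail = str(link.next_sibling or '').split()
    if len(tail) >= 2:
        rows.append((link.get_text().rstrip('/'), tail[0], tail[1]))

for version, date, time in rows:
    print(version, date, time)
```

This keeps the original loop over the <a> tags, so the version still comes from the link itself and the date is picked up from the text beside it.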

UPDATE

import requests
from bs4 import BeautifulSoup

url = 'https://get.geo.opera.com/pub/opera/desktop/'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')

# The dates are plain text inside <pre>, so take its whole text and go line by line.
for line in soup.find('pre').get_text().splitlines():
    fields = line.split()
    # Skip the "../" entry and blank lines, which have no date and time columns.
    if len(fields) < 3:
        continue
    version = fields[0].rstrip('/')  # drop the trailing slash of the directory name
    date, time = fields[1], fields[2]
    print(version, date, time)
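
If the date is needed as a real datetime rather than a string, the row text can be parsed offline like this. This is only a sketch: parse_listing and sample are hypothetical names, the sample rows are copied from the question, and it assumes every row has the form "name  DD-Mon-YYYY HH:MM" with English month abbreviations.

```python
from datetime import datetime

def parse_listing(text):
    """Return (version, datetime) pairs from the text of the <pre> block."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:  # skips "../" and blank lines
            continue
        # Assumed format: 01-Jul-2013 15:18
        stamp = datetime.strptime(fields[1] + ' ' + fields[2], '%d-%b-%Y %H:%M')
        entries.append((fields[0].rstrip('/'), stamp))
    return entries

# Sample rows taken from the listing in the question.
sample = (
    "../\r\n"
    "15.0.1147.130/      01-Jul-2013 15:18      -\r\n"
    "15.0.1147.138/      09-Jul-2013 12:11      -\r\n"
)
for version, stamp in parse_listing(sample):
    print(version, stamp.isoformat())
```

With the live page, the argument would be soup.find('pre').get_text() instead of the sample string.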