Removing newlines inside <a> tag

I am quite new to Python and also to BeautifulSoup. Recently I have been trying to get all of the <a> tags from a local HTML file. Part of my code looks like this:

from bs4 import BeautifulSoup

# dir_path is the path to the local HTML file
with open(dir_path, encoding="utf-8-sig") as html_file:
    soup = BeautifulSoup(html_file, 'html.parser')
tag = soup.find('a')
print(tag)

The output looks like this:

<a class="cmp-image__link" data-cmp-hook-image="link" href="/">
<img alt="logo" class="cmp-image__image" data-cmp-hook-image="image" itemprop="contentUrl" src="//imagesource"/>
</a>

What I want to get is a string without newlines inside the <a> tag block:

<a class="cmp-image__link" data-cmp-hook-image="link" href="/"><img alt="logo" class="cmp-image__image" data-cmp-hook-image="image" itemprop="contentUrl" src="//imagesource"/></a>

I tried .strip() and .replace(), but neither worked. Please help!

Solution:

.strip() won’t work here, as it only removes leading or trailing whitespace.

.replace() will work here, but you need to assign its return value, as it doesn’t modify the string in-place (because Python strings are immutable).
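As a minimal sketch of both points, the snippet below uses a shortened, hypothetical sample string (not the exact markup from the question) to show that .strip() leaves inner newlines alone and that .replace() has no effect unless its return value is kept:

```python
# Hypothetical sample string, shortened from the question's output.
html = '\n<a href="/">\n<img alt="logo"/>\n</a>\n'

# strip() trims whitespace only at the two ends; inner newlines survive.
stripped = html.strip()

# replace() returns a new string; calling it without assignment changes nothing.
html.replace('\n', '')              # result discarded, html is unchanged
flattened = html.replace('\n', '')  # result kept

print(repr(stripped))
print(repr(flattened))
```

Printing with repr() makes the remaining '\n' characters visible, which is handy when checking whether a cleanup step did anything.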

This:

tag = '''<a class="cmp-image__link" data-cmp-hook-image="link" href="/">
<img alt="logo" class="cmp-image__image" data-cmp-hook-image="image" itemprop="contentUrl" src="//imagesource"/>
</a>'''

result = tag.replace('\n', '')

print(result)

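produces the desired output. Note that in the question's code, tag is a BeautifulSoup Tag object rather than a str, so convert it first with str(tag) before calling .replace().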
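If the file uses Windows line endings or indents the nested tags, replacing '\n' alone may leave stray characters behind. One possible alternative, offered as a sketch rather than as part of the original answer, is to drop any whitespace that sits between two tags with a regular expression. The sample string here is hypothetical:

```python
import re

# Hypothetical sample with Windows line endings and indentation.
tag = '<a href="/">\r\n    <img alt="logo"/>\r\n</a>'

# Remove any run of whitespace between a closing '>' and the next '<'.
result = re.sub(r'>\s+<', '><', tag)

print(result)
```

This only touches whitespace between tags, so text inside an element keeps its spacing. It is a string-level shortcut and assumes the whitespace between those tags carries no meaning for the page.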
