How to take the output of a print and write it in a file?

I am trying to take the output of a web scrape and write it to a single .txt file, but I get this error:

'charmap' codec can't encode character '\u200a' in position 23130: character maps to <undefined>
  File "C:\Users\Web scrapper.py", line 12, in <module>
    f.write(y)

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pyperclip
x = input("Link you want to scrap from:")
url = x
page = urlopen(url)
html = page.read().decode("utf-8")
soup = BeautifulSoup(html, "html.parser")
y = str(soup.get_text())
print(y)
with open('Dogs.txt', 'w') as f:
    f.write(y)

Solution:

Your file is opened with the platform's default encoding, which on your Windows setup is a legacy code page that Python reports as the 'charmap' codec. You are trying to write a character that this codec doesn't support, hence the error. To avoid this, open the file for writing with the same encoding you used to decode the HTML content, like this:

from urllib.request import urlopen

url = input("Link you want to scrap from:")
page = urlopen(url)
html = page.read().decode("utf-8")
print(html)
with open('Dogs.txt', 'w', encoding="utf-8") as f:
    f.write(html)

Also, as @Code-Apprentice wrote, there’s no need to use BeautifulSoup here.
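
The cause and the fix can also be checked offline, without fetching a page. The following is a minimal sketch, not the asker's script: the helper names `can_encode` and `save_text` are hypothetical, the sample string is made up, and `cp1252` is only assumed to be the default code page on the asker's machine (the traceback does not name it).

```python
import os
import tempfile

# Made-up sample text containing U+200A (hair space),
# the character named in the traceback.
text = "dogs\u200aand cats"


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def save_text(s, path):
    """Write s to path as UTF-8, independent of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(s)


# Assumption: cp1252 stands in for the Windows default code page.
print(can_encode(text, "cp1252"))  # False
print(can_encode(text, "utf-8"))   # True

# Write to a temporary directory and read the file back as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "Dogs.txt")
save_text(text, path)
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
print(round_trip == text)  # True
```

Passing `encoding="utf-8"` on both the write and the later read keeps the file's contents independent of whichever default the operating system happens to use.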
