How is Python's sys.getdefaultencoding() used?

February 14, 2022

Python’s default encoding got me confused.

There is an á character in a text file’s content.
The file is saved as UTF-8 in Notepad.
When I don’t specify encoding='utf-8' in:

with open(filename,encoding='utf-8') as f:
    for line in f:
        print(line)

it shows up as Ã¡.
When I do add the encoding='utf-8' part it shows up as á.

I am wondering what sys.getdefaultencoding() is useful for: it returns utf-8, but I still had to specify utf-8 as the encoding for the á to show up correctly in the output.

I’m using Python 3.

Extra edit:

I think the encoding that is used is probably latin-1 extended, since:
á in UTF-8 is the two bytes 0xC3 0xA1, and in latin-1 extended 0xC3 maps to Ã and 0xA1 maps to ¡.
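
The byte mapping above can be reproduced in a few lines. This is only a sketch: it decodes the UTF-8 bytes of á with latin-1 and also with cp1252. The cp1252 check is an assumption on my part (it is a common locale encoding on Western European Windows installs, which is where Notepad runs); the question itself does not say which code page is active.

```python
# UTF-8 encodes "á" as two bytes.
utf8_bytes = "á".encode("utf-8")
print(utf8_bytes)  # b'\xc3\xa1'

# Decoding those two bytes with a single-byte codec yields two characters.
as_latin1 = utf8_bytes.decode("latin-1")
# Assumption: cp1252 is the Windows code page in play; it agrees with
# latin-1 for both of these byte values.
as_cp1252 = utf8_bytes.decode("cp1252")

print(as_latin1 == "Ã¡")       # True
print(as_cp1252 == as_latin1)  # True
```

Because both codecs give the same result for these two bytes, the Ã¡ output alone cannot tell latin-1 and cp1252 apart.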

How could I verify that latin-1 extended will be used when not specifying encoding?

Solution:

Read the docs in Built-in Functions -> open():

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

…
In text mode, if encoding is not specified the encoding used
is platform dependent: locale.getpreferredencoding(False) is
called to get the current locale encoding.
…

where locale.getpreferredencoding(do_setlocale=True)
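
To check which encoding open() actually picks on a given machine, something like the following sketch may help. It compares sys.getdefaultencoding() with locale.getpreferredencoding(False) and then inspects the encoding attribute of a file opened without an encoding argument. The temporary file is a hypothetical stand-in for the real text file.

```python
import locale
import os
import sys
import tempfile

# Default for str.encode() / bytes.decode(); 'utf-8' on Python 3.
default_encoding = sys.getdefaultencoding()
# What open() falls back to in text mode when encoding is omitted.
locale_encoding = locale.getpreferredencoding(False)
print(default_encoding, locale_encoding)

# Hypothetical stand-in for the real file: the UTF-8 bytes of "á".
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("á".encode("utf-8"))
    path = tmp.name

try:
    with open(path) as f:  # no encoding given
        implicit_encoding = f.encoding  # the encoding open() chose
    with open(path, encoding="utf-8") as f:
        text = f.read()  # decoded correctly regardless of locale
finally:
    os.remove(path)

print(implicit_encoding)
```

On a Windows machine with a Western European locale the last line would likely print cp1252 rather than utf-8, which would explain the Ã¡ output. sys.getdefaultencoding() has no bearing on open(); it only describes the default for string encode and decode calls.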